TITLE OF INVENTION

CHLAMYDIA ANTIGENS AND CORRESPONDING DNA FRAGMENTS AND USES 
THEREOF 

REFERENCE TO RELATED APPLICATIONS

This application claims the benefit of U.S.
Provisional Application No. 60/106034, filed October 28, 1998,
U.S. Provisional Application No. 60/106039, filed October 28,
1998, U.S. Provisional Application No. 60/106042, filed October
28, 1998, U.S. Provisional Application No. 60/106044, filed
October 28, 1998, U.S. Provisional Application No. 60/106072,
filed October 29, 1998, U.S. Provisional Application No.
60/106073, filed October 29, 1998, U.S. Provisional Application
No. 60/106074, filed October 29, 1998, U.S. Provisional
Application No. 60/106087, filed October 29, 1998, U.S.
Provisional Application No. 60/106587, filed November 2, 1998,
U.S. Provisional Application No. 60/106588, filed November 2,
1998, U.S. Provisional Application No. 60/107089, filed November
2, 1998, U.S. Provisional Application No. 60/107034, filed
November 2, 1998 and U.S. Provisional Application No. 60/107035,
filed November 2, 1998.

FIELD OF INVENTION 

The present invention relates to Chlamydia antigens 
and corresponding DNA molecules, which can be used to prevent
and treat Chlamydia infection in mammals, such as humans. 

BACKGROUND OF THE INVENTION 

Chlamydiae are prokaryotes. They exhibit morphologic
and structural similarities to gram-negative bacteria including
a trilaminar outer membrane, which contains lipopolysaccharide
and several membrane proteins that are structurally and
functionally analogous to proteins found in E. coli. They are
obligate intracellular parasites with a unique biphasic life
cycle consisting of a metabolically inactive but infectious
extracellular stage and a replicating but non-infectious
intracellular stage. The replicative stage of the life cycle
takes place within a membrane-bound inclusion which sequesters
the bacteria away from the cytoplasm of the infected host cell.

C. pneumoniae is a common human pathogen, originally 
described as the TWAR strain of Chlamydia psittaci but 
subsequently recognised to be a new species. C. pneumoniae is 
antigenically, genetically and morphologically distinct from 
other chlamydia species (C. trachomatis, C. pecorum and C.
psittaci). It shows 10% or less DNA sequence homology with
either of C. trachomatis or C. psittaci. 

C. pneumoniae is a common cause of community acquired
pneumonia, less frequent only than Streptococcus pneumoniae and
Mycoplasma pneumoniae (Grayston et al. (1995) Journal of
Infectious Diseases 168:1231; Campos et al. (1995) Investigation
of Ophthalmology and Visual Science 36:1477). It can also cause
upper respiratory tract symptoms and disease, including
bronchitis and sinusitis (Grayston et al. (1995) Journal of
Infectious Diseases 168:1231; Grayston et al (1990) Journal of
Infectious Diseases 161:618; Marrie (1993) Clinical Infectious
Diseases. 18:501; Wang et al (1986) Chlamydial infections.
Cambridge University Press, Cambridge, p. 329). The great majority
of the adult population (over 60%) has antibodies to C.
pneumoniae (Wang et al (1986) Chlamydial infections. Cambridge
University Press, Cambridge, p. 329), indicating past infection
which was unrecognized or asymptomatic.

C. pneumoniae infection usually presents as an acute
respiratory disease (i.e., cough, sore throat, hoarseness, and
fever; abnormal chest sounds on auscultation). For most
patients, the cough persists for 2 to 6 weeks, and recovery is
slow. In approximately 10% of these cases, upper respiratory
tract infection is followed by bronchitis or pneumonia.
Furthermore, during a C. pneumoniae epidemic, subsequent
co-infection with pneumococcus has been noted in about half of
these pneumonia patients, particularly in the infirm and the
elderly. There is also growing evidence that
C. pneumoniae infection is linked to diseases other than
respiratory infections.

The reservoir for the organism is presumably people. 
In contrast to C. psittaci infections, there is no known bird or 
animal reservoir. Transmission has not been clearly defined. It 
may result from direct contact with secretions, from fomites, or
from airborne spread. There is a long incubation period, which
may last for many months. Based on analysis of epidemics, C. 
pneumoniae appears to spread slowly through a population (case- 
to-case interval averaging 30 days) because infected persons are 
inefficient transmitters of the organism. Susceptibility to C.
pneumoniae is universal. Reinfections occur during adulthood,
following the primary infection as a child. C. pneumoniae 
appears to be an endemic disease throughout the world, 
noteworthy for superimposed intervals of increased incidence 
(epidemics) that persist for 2 to 3 years. C. trachomatis
infection does not confer cross-immunity to C. pneumoniae.
Infections are easily treated with oral antibiotics,
tetracycline or erythromycin (2 g/d, for at least 10 to 14 d). A
recently developed drug, azithromycin, is highly effective as a 
single-dose therapy against chlamydial infections. 

In most instances, C. pneumoniae infection is
mild and without complications, and up to 90% of infections are
subacute or unrecognized. Among children in industrialized
countries, infections have been thought to be rare up to the age
of 5 y, although a recent study (E Normann et al, Chlamydia
pneumoniae in children with acute respiratory tract infections,
Acta Paediatrica, 1998, Vol 87, Iss 1, pp 23-27) has reported
that many children in this age group show PCR evidence of
infection despite being seronegative, and estimates a prevalence
of 17-19% in 2-4 y olds. In developing countries, the
seroprevalence of C. pneumoniae antibodies among young children
is elevated, and there are suspicions that C. pneumoniae may be
an important cause of acute lower respiratory tract disease and
mortality for infants and children in tropical regions of the
world.

From seroprevalence studies and studies of local 
epidemics, the initial C. pneumoniae infection usually happens 
between the ages of 5 and 20 y. In the USA, for example, there 
are estimated to be 30,000 cases of childhood pneumonia each
year caused by C. pneumoniae. Infections may cluster among
groups of children or young adults (e.g., school pupils or
military conscripts).

C. pneumoniae causes 10 to 25% of community-acquired
lower respiratory tract infections (as reported from Sweden,
Italy, Finland, and the USA). During an epidemic, C. pneumoniae
infection may account for 50 to 60% of the cases of pneumonia.
During these periods, also, more episodes of mixed infections
with S. pneumoniae have been reported.

Reinfection during adulthood is common; the clinical
presentation tends to be milder. Based on population
seroprevalence studies, there tends to be increased exposure
with age, which is particularly evident among men. Some
investigators have speculated that a persistent, asymptomatic C.
pneumoniae infection state is common.

In adults of middle age or older, C. pneumoniae
infection may progress to chronic bronchitis and sinusitis. A
study in the USA revealed that the incidence of pneumonia caused
by C. pneumoniae in persons younger than 60 years is 1 case per
1,000 persons per year; but in the elderly, the disease
incidence rose three-fold. C. pneumoniae infection rarely leads
to hospitalization, except in patients with an underlying
illness.

Of considerable importance is the association of
atherosclerosis and C. pneumoniae infection. There are several
epidemiological studies showing a correlation of previous
infections with C. pneumoniae and heart attacks, coronary artery
and carotid artery disease (Saikku et al. (1988) Lancet ii:983;
Thom et al. (1992) JAMA 268:68; Linnanmaki et al. (1993)
Circulation 87:1030; Saikku et al. (1992) Annals of Internal
Medicine 116:273; Melnick et al (1993) American Journal of
Medicine 95:499). Moreover, the organism has been detected in
atheromas and fatty streaks of the coronary, carotid, peripheral
arteries and aorta (Shor et al. (1992) South African Medical
Journal 82:158; Kuo et al. (1993) Journal of Infectious Diseases
167:841; Kuo et al. (1993) Arteriosclerosis and Thrombosis
13:1500; Campbell et al (1995) Journal of Infectious Diseases
172:585; Chiu et al. Circulation, 1997 (In Press)). Viable C.
pneumoniae has been recovered from the coronary and carotid
artery (Ramirez et al (1996) Annals of Internal Medicine
125:979; Jackson et al. Abst. K121, p272, 36th ICAAC, 15-18
Sept. 1996, New Orleans). Furthermore, it has been shown that
C. pneumoniae can induce changes of atherosclerosis in a rabbit
model (Fong et al (1997) Journal of Clinical Microbiology
35:48). Taken together, these results indicate that it is
highly probable that C. pneumoniae can cause atherosclerosis in
humans, though the epidemiological importance of chlamydial
atherosclerosis remains to be demonstrated.

A number of recent studies have also indicated an
association between C. pneumoniae infection and asthma.
Infection has been linked to wheezing, asthmatic bronchitis,
adult-onset asthma and acute exacerbations of asthma in adults,
and small-scale studies have shown that prolonged antibiotic
treatment was effective at greatly reducing the severity of the
disease in some individuals (Hahn DL, et al. Evidence for
Chlamydia pneumoniae infection in steroid-dependent asthma.
Ann Allergy Asthma Immunol. 1998 Jan; 80(1): 45-49; Hahn DL, et
al. Association of Chlamydia pneumoniae IgA antibodies with
recently symptomatic asthma. Epidemiol Infect. 1996 Dec;
117(3): 513-517; Bjornsson E, et al. Serology of chlamydia in
relation to asthma and bronchial hyperresponsiveness. Scand J
Infect Dis. 1996; 28(1): 63-69; Hahn DL. Treatment of Chlamydia
pneumoniae infection in adult asthma: a before-after trial. J
Fam Pract. 1995 Oct; 41(4): 345-351; Allegra L, et al. Acute
exacerbations of asthma in adults: role of Chlamydia pneumoniae
infection. Eur Respir J. 1994 Dec; 7(12): 2165-2168; Hahn DL,
et al. Association of Chlamydia pneumoniae (strain TWAR)
infection with wheezing, asthmatic bronchitis, and adult-onset
asthma. JAMA. 1991 Jul 10; 266(2): 225-230).

In light of these results a protective vaccine against 
C. pneumoniae infection would be of considerable importance. 
There is not yet an effective vaccine for any human chlamydial 
infection. It is conceivable that an effective vaccine can be
developed using physically or chemically inactivated Chlamydiae.
However, such a vaccine does not have a high margin of safety. 
In general, safer vaccines are made by genetically manipulating 
the organism by attenuation or by recombinant means. 
Accordingly, a major obstacle in creating an effective and safe
vaccine against human chlamydial infection has been the paucity
of genetic information regarding Chlamydia, specifically C. 
pneumoniae. 

Studies with C. trachomatis and C. psittaci indicate
that a safe and effective vaccine against Chlamydia is an
attainable goal. For example, mice which have recovered from a
lung infection with C. trachomatis are protected from
infertility induced by a subsequent vaginal challenge (Pal et
al. (1996) Infection and Immunity 64:5341). Similarly, sheep
immunized with inactivated C. psittaci were protected from
subsequent chlamydial-induced abortions and stillbirths (Jones
et al. (1995) Vaccine 13:715). Protection from chlamydial
infections has been associated with Th1 immune responses,
particularly the induction of IFN-γ-producing CD4+ T cells
(Igietseme et al. (1993) Regional Immunology 5:317). The adoptive
transfer of CD4+ cell lines or clones to nude or SCID mice
conferred protection from challenge or cleared chronic disease
(Igietseme et al (1993) Regional Immunology 5:317; Magee et al
(1993) Regional Immunology 5:305), and in vivo depletion of
CD4+ T cells exacerbated disease post-challenge (Landers et al
(1991) Infection & Immunity 59:3774; Magee et al (1995)
Infection & Immunity 63:516). However, the presence of
sufficiently high titres of neutralising antibody at mucosal
surfaces can also exert a protective effect (Cotter et al.
(1995) Infection and Immunity 63:4704).

Antigenic variation within the species C. pneumoniae
is not well documented due to insufficient genetic information,
though variation is expected to exist based on C. trachomatis.
Serovars of C. trachomatis are defined on the basis of antigenic
variation in the major outer membrane protein (MOMP), but
published C. pneumoniae MOMP gene
sequences show no variation between several diverse isolates of
the organism (Campbell et al (1990) Infection and Immunity
58:93; McCafferty et al (1995) Infection and Immunity 63:2387-9;
Knudsen et al (1996) Third Meeting of the European Society for
Chlamydia Research, Vienna). Regions of the protein known to be
conserved in other chlamydial MOMPs are conserved in C.
pneumoniae (Campbell et al (1990) Infection and Immunity 58:93;
McCafferty et al (1995) Infection and Immunity 63:2387-9). One
study has described a strain of C. pneumoniae with a MOMP of
greater than usual molecular weight, but the gene for this has
not been sequenced (Grayston et al. (1995) Journal of Infectious
Diseases 168:1231). Partial sequences of outer membrane protein
2 from nine diverse isolates were also found to be invariant
(Ramirez et al (1996) Annals of Internal Medicine 125:979). The
genes for HSP60 and HSP70 show little variation from other
chlamydial species, as would be expected. The gene encoding a
76 kDa antigen has been cloned from a single strain of C.
pneumoniae. It has no significant similarity with other known
chlamydial genes (Marrie (1993) Clinical Infectious Diseases.
18:501).

Many antigens recognised by immune sera to C.
pneumoniae are conserved across all chlamydiae, but 98 kDa, 76
kDa and 54 kDa proteins appear to be C. pneumoniae-specific
(Campos et al. (1995) Investigation of Ophthalmology and Visual
Science 36:1477; Marrie (1993) Clinical Infectious Diseases.
18:501; Wiedmann-Al-Ahmad M, et al. Reactions of polyclonal and
neutralizing anti-p54 monoclonal antibodies with an isolated,
species-specific 54-kilodalton protein of Chlamydia pneumoniae.
Clin Diagn Lab Immunol. 1997 Nov; 4(6): 700-704).
Immunoblotting of isolates with sera from patients does show
variation of blotting patterns between isolates, indicating that
serotypes of C. pneumoniae may exist (Ref 1,16). However, the
results are potentially confounded by the infection status of
the patients, since immunoblot profiles of a patient's sera
change with time post-infection. An assessment of the number
and relative frequency of any serotypes, and the defining
antigens, is not yet possible.

Accordingly, a need exists for identifying and
isolating polynucleotide sequences of C. pneumoniae for use in
preventing and treating Chlamydia infection. 

SUMMARY OF THE INVENTION 

The present invention provides purified and isolated
polynucleotide molecules that encode Chlamydia polypeptides
which can be used in methods to prevent, treat, and diagnose
Chlamydia infection. In one form of the invention, the
polynucleotide molecules are selected from DNA molecules that encode
polypeptides CPN100397 (SEQ ID Nos: 1 and 2), CPN100421 (SEQ ID
Nos: 3 and 4), CPN100422 (SEQ ID Nos: 5 and 6), CPN100424 (SEQ
ID Nos: 7 and 8), CPN100426 (SEQ ID Nos: 9 and 10), CPN100508
(SEQ ID Nos: 11 and 12), CPN100515 (SEQ ID Nos: 13 and 14),
CPN100538 (SEQ ID Nos: 15 and 16), CPN100557 (SEQ ID Nos: 17 and
18), CPN100622 (SEQ ID Nos: 19 and 20), CPN100626 (SEQ ID Nos:
21 and 22), CPN100628 (SEQ ID Nos: 23 and 24) and CPN100630
(SEQ ID Nos: 25 and 26).

Another form of the invention provides polypeptides
corresponding to the isolated DNA molecules. The amino acid
sequences of the corresponding encoded polypeptides are shown 
for CPN100397 as SEQ ID Nos: 27 and 28, CPN100421 as SEQ ID No: 
29, CPN100422 as SEQ ID No: 30, CPN100424 as SEQ ID No: 31, 
CPN100426 as SEQ ID No: 32, CPN100508 as SEQ ID Nos: 33 and 34,
CPN100515 as SEQ ID Nos: 35 and 36, CPN100538 as SEQ ID No: 37,
CPN100557 as SEQ ID Nos: 38 and 39, CPN100622 as SEQ ID Nos: 40 
and 41, CPN100626 as SEQ ID No: 42, CPN100628 as SEQ ID No: 43 
and CPN100630 as SEQ ID Nos: 44 and 45. 

Those skilled in the art will readily understand that the
invention, having provided the polynucleotide sequences encoding
Chlamydia polypeptides, also provides polynucleotides encoding 
fragments derived from such peptides. Moreover, the invention 
is understood to provide mutants and derivatives of such 
polypeptides and fragments derived therefrom, which result from
the addition, deletion, or substitution of non-essential amino
acids as described herein. Those skilled in the art would also 
readily understand that the invention, having provided the 
polynucleotide sequences encoding Chlamydia polypeptides, 
further provides monospecific antibodies that specifically bind
to such polypeptides.

The present invention has wide application and includes 
expression cassettes, vectors, and cells transformed or 
transfected with the polynucleotides of the invention. 
Accordingly, the present invention further provides (i) a method
for producing a polypeptide of the invention in a recombinant
host system and related expression cassettes, vectors, and 
transformed or transfected cells; (ii) a vaccine, or a live 
vaccine vector such as a pox virus, Salmonella typhimurium, or 
Vibrio cholerae vector, containing a polynucleotide of the
invention, such vaccines and vaccine vectors being useful for,
e.g., preventing and treating Chlamydia infection, in 
combination with a diluent or carrier, and related 
pharmaceutical compositions and associated therapeutic and/or 
prophylactic methods; (iii) a therapeutic and/or prophylactic
use of an RNA or DNA molecule of the invention, either in a 
naked form or formulated with a delivery vehicle, a polypeptide 
or combination of polypeptides, or a monospecific antibody of 
the invention, and related pharmaceutical compositions; (iv) a 
method for diagnosing the presence of Chlamydia in a biological
sample, which can involve the use of a DNA or RNA molecule, a 
monospecific antibody, or a polypeptide of the invention; and 
(v) a method for purifying a polypeptide of the invention by 
antibody-based affinity chromatography. 

BRIEF DESCRIPTION OF THE DRAWINGS 

The present invention will be further understood from the 
following description with reference to the drawings, in which: 

Figure 1 shows the nucleotide sequence of the CPN100397 gene
(SEQ ID No: 1 - entire sequence and SEQ ID No: 2 - coding
sequence) and the deduced amino acid sequence of the CPN100397
protein from Chlamydia pneumoniae (SEQ ID Nos: 27 and 28).

Figure 2 shows the restriction enzyme analysis of the
C. pneumoniae CPN100397 gene.

Figure 3 shows the nucleotide sequence of the CPN100421 gene
(SEQ ID No: 3 - entire sequence and SEQ ID No: 4 - coding
sequence) and the deduced amino acid sequence of the CPN100421
protein from Chlamydia pneumoniae (SEQ ID No: 29).

Figure 4 shows the restriction enzyme analysis of the
C. pneumoniae CPN100421 gene.

Figure 5 shows the nucleotide sequence of the CPN100422 gene
(SEQ ID No: 5 - entire sequence and SEQ ID No: 6 - coding
sequence) and the deduced amino acid sequence of the CPN100422
protein from Chlamydia pneumoniae (SEQ ID No: 30).

Figure 6 shows the restriction enzyme analysis of the
C. pneumoniae CPN100422 gene.

Figure 7 shows the nucleotide sequence of the CPN100424 gene
(SEQ ID No: 7 - entire sequence and SEQ ID No: 8 - coding
sequence) and the deduced amino acid sequence of the CPN100424
protein from Chlamydia pneumoniae (SEQ ID No: 31).

Figure 8 shows the restriction enzyme analysis of the
C. pneumoniae CPN100424 gene.

Figure 9 shows the nucleotide sequence of the CPN100426 gene
(SEQ ID No: 9 - entire sequence and SEQ ID No: 10 - coding
sequence) and the deduced amino acid sequence of the CPN100426
protein from Chlamydia pneumoniae (SEQ ID No: 32).

Figure 10 shows the restriction enzyme analysis of the
C. pneumoniae CPN100426 gene.

Figure 11 shows the nucleotide sequence of the CPN100508 gene
(SEQ ID No: 11 - entire sequence and SEQ ID No: 12 - coding
sequence) and the deduced amino acid sequence of the
CPN100508 protein from Chlamydia pneumoniae (SEQ ID No: 33 - full
length sequence and SEQ ID No: 34 - processed sequence).

Figure 12 shows the restriction enzyme analysis of the
C. pneumoniae CPN100508 gene.

Figure 13 shows the nucleotide sequence of the CPN100515 gene
(SEQ ID No: 13 - entire sequence and SEQ ID No: 14 - coding
sequence) and the deduced amino acid sequence of the CPN100515
protein from Chlamydia pneumoniae (SEQ ID No: 35 - full length
sequence and SEQ ID No: 36 - processed sequence).

Figure 14 shows the restriction enzyme analysis of the
C. pneumoniae CPN100515 gene.

Figure 15 shows the nucleotide sequence of the CPN100538 gene
(SEQ ID No: 15 - entire sequence and SEQ ID No: 16 - coding
sequence) and the deduced amino acid sequence of the CPN100538
protein from Chlamydia pneumoniae (SEQ ID No: 37).

Figure 16 shows the restriction enzyme analysis of the
C. pneumoniae CPN100538 gene.

Figure 17 shows the nucleotide sequence of the CPN100557 gene
(SEQ ID No: 17 - entire sequence and SEQ ID No: 18 - coding
sequence) and the deduced amino acid sequence of the CPN100557
protein from Chlamydia pneumoniae (SEQ ID No: 38 - full length
sequence and SEQ ID No: 39 - processed sequence).

Figure 18 shows the restriction enzyme analysis of the
C. pneumoniae CPN100557 gene.

Figure 19 shows the nucleotide sequence of the CPN100622 gene
(SEQ ID No: 19 - entire sequence and SEQ ID No: 20 - coding
sequence) and the deduced amino acid sequence of the CPN100622
protein from Chlamydia pneumoniae (SEQ ID No: 40 - full length
sequence and SEQ ID No: 41 - processed sequence).

Figure 20 shows the restriction enzyme analysis of the
C. pneumoniae CPN100622 gene.

Figure 21 shows the nucleotide sequence of the CPN100626 gene
(SEQ ID No: 21 - entire sequence and SEQ ID No: 22 - coding
sequence) and the deduced amino acid sequence of the CPN100626
protein from Chlamydia pneumoniae (SEQ ID No: 42).

Figure 22 shows the restriction enzyme analysis of the
C. pneumoniae CPN100626 gene.

Figure 23 shows the nucleotide sequence of the CPN100628 gene
(SEQ ID No: 23 - entire sequence and SEQ ID No: 24 - coding
sequence) and the deduced amino acid sequence of the CPN100628
protein from Chlamydia pneumoniae (SEQ ID No: 43).

Figure 24 shows the restriction enzyme analysis of the
C. pneumoniae CPN100628 gene.

Figure 25 shows the nucleotide sequence of the CPN100630 gene
(SEQ ID No: 25 - entire sequence and SEQ ID No: 26 - coding
sequence) and the deduced amino acid sequence of the CPN100630
protein from Chlamydia pneumoniae (SEQ ID No: 44 - full length
sequence and SEQ ID No: 45 - processed sequence).

Figure 26 shows the restriction enzyme analysis of the
C. pneumoniae CPN100630 gene.

Figures 27 through 39 show an identification of T and B
cell epitopes from the amino acid sequences shown in the
foregoing figures.

DETAILED DESCRIPTION OF INVENTION

Open reading frames (ORFs) encoding chlamydial 
polypeptides have been identified from the C. pneumoniae genome. 
These polypeptides include polypeptides found permanently in 
the bacterial membrane structure, polypeptides present in the
external vicinity of the bacterial membrane, polypeptides found
permanently in the inclusion membrane structure, polypeptides 
present in the external vicinity of the inclusion membrane, and 
polypeptides released into the cytoplasm of the infected cell. 
These polypeptides can be used to prevent and treat Chlamydia
infection.

According to a first aspect of the invention, isolated 
polynucleotides are provided which encode the precursor and 
mature forms of Chlamydia polypeptides, whose amino acid 
sequences are selected from the group consisting of: SEQ ID
Nos: 27 to 45.

The term "isolated polynucleotide" is defined as a
polynucleotide removed from the environment in which it
naturally occurs. For example, a naturally-occurring DNA
molecule present in the genome of a living bacterium or as part
of a gene bank is not isolated, but the same molecule separated
from the remaining part of the bacterial genome, as a result of,
e.g., a cloning event (amplification), is isolated. Typically,
an isolated DNA molecule is free from DNA regions (e.g., coding
regions) with which it is immediately contiguous at the 5' or 3'
end, in the naturally occurring genome. Such isolated
polynucleotides may be part of a vector or a composition and
still be defined as isolated in that such a vector or
composition is not part of the natural environment of such
polynucleotide.

The polynucleotide of the invention is either RNA or DNA
(cDNA, genomic DNA, or synthetic DNA), or modifications,
variants, homologs or fragments thereof. The DNA is either
double-stranded or single-stranded, and, if single-stranded, is
either the coding strand or the non-coding (anti-sense) strand.
Any one of the sequences that encode the polypeptides of the
invention as shown in SEQ ID Nos: 1 to 26 is (a) a coding
sequence, (b) a ribonucleotide sequence derived from
transcription of (a), or (c) a coding sequence which uses the
redundancy or degeneracy of the genetic code to encode the same
polypeptides. By "polypeptide" or "protein" is meant any chain
of amino acids, regardless of length or post-translational
modification (e.g., glycosylation or phosphorylation). Both
terms are used interchangeably in the present application.

Consistent with the first aspect of the invention, amino
acid sequences are provided which are homologous to any one of
SEQ ID Nos: 27 to 45. As used herein, "homologous amino acid
sequence" is any polypeptide which is encoded, in whole or in
part, by a nucleic acid sequence which hybridizes at 25-35°C
below critical melting temperature (Tm), to any portion of the
nucleic acid sequences of SEQ ID Nos: 1 to 26. A homologous
amino acid sequence is one that differs from an amino acid
sequence shown in any one of SEQ ID Nos: 27 to 45 by one or
more amino acid substitutions. Such a sequence also encompasses
serotypic variants (defined below) as well as sequences
containing deletions or insertions which retain inherent
characteristics of the polypeptide such as immunogenicity.
Preferably, such a sequence is at least 75%, more preferably
80%, and most preferably 90% identical to any one of SEQ ID
Nos: 27 to 45. Homologous amino acid sequences include
sequences that are identical or substantially identical to SEQ
ID Nos: 27 to 45. By "amino acid sequence substantially
identical" is meant a sequence that is at least 90%, preferably
95%, more preferably 97%, and most preferably 99% identical to
an amino acid sequence of reference and that preferably differs
from the sequence of reference by a majority of conservative
amino acid substitutions.

Conservative amino acid substitutions are substitutions
among amino acids of the same class. These classes include, for
example, amino acids having uncharged polar side chains, such as
asparagine, glutamine, serine, threonine, and tyrosine; amino
acids having basic side chains, such as lysine, arginine, and
histidine; amino acids having acidic side chains, such as
aspartic acid and glutamic acid; and amino acids having nonpolar
side chains, such as glycine, alanine, valine, leucine,
isoleucine, proline, phenylalanine, methionine, tryptophan, and
cysteine.
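
The class-based definition above can be illustrated with a short
sketch. It is only an illustration, not part of the described
method: the one-letter residue codes, the dictionary layout, and
the function names are assumptions introduced here, and the four
classes contain exactly the residues listed in the preceding
paragraph.

```python
# Side-chain classes as listed in the text, written with standard
# one-letter amino acid codes (the codes are an assumption of this sketch).
AMINO_ACID_CLASSES = {
    "uncharged_polar": set("NQSTY"),   # Asn, Gln, Ser, Thr, Tyr
    "basic": set("KRH"),               # Lys, Arg, His
    "acidic": set("DE"),               # Asp, Glu
    "nonpolar": set("GAVLIPFMWC"),     # Gly, Ala, Val, Leu, Ile, Pro, Phe, Met, Trp, Cys
}


def side_chain_class(residue):
    """Return the name of the class that contains the given residue."""
    code = residue.upper()
    for name, members in AMINO_ACID_CLASSES.items():
        if code in members:
            return name
    raise ValueError(f"unknown amino acid code: {residue!r}")


def is_conservative(original, replacement):
    """True when two different residues belong to the same class."""
    if original.upper() == replacement.upper():
        return False  # identical residues are not a substitution
    return side_chain_class(original) == side_chain_class(replacement)
```

For instance, under these assumptions a lysine-to-arginine change
is classed as conservative, whereas a lysine-to-aspartic acid
change is not.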

Homology is measured using sequence analysis software
such as Sequence Analysis Software Package of the Genetics
Computer Group, University of Wisconsin Biotechnology Center, 
1710 University Avenue, Madison, WI 53705. Amino acid sequences 
are aligned to maximize identity. Gaps may be artificially 
introduced into the sequence to attain proper alignment. Once
the optimal alignment has been set up, the degree of homology is
established by recording all of the positions in which the amino 
acids of both sequences are identical, relative to the total 
number of positions. 
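
The identity count described above can be sketched as follows.
This is a minimal illustration and not the algorithm of the
software package named in the text: it assumes that the two
sequences have already been aligned to equal length, that gaps
are written as "-", and that the "total number of positions" is
the full alignment length including gap columns. The function
name and example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of alignment positions holding identical residues.

    Both inputs must be pre-aligned strings of equal length; a gap
    character never counts as an identical position.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    identical = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * identical / len(aligned_a)


# Hypothetical six-column alignment: four columns carry identical residues.
example = percent_identity("MKT-LV", "MKTALI")
```

A value computed this way could then be compared with the
percentage thresholds recited for homologous sequences.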

Homologous polynucleotide sequences are defined in a
similar way. Preferably, a homologous sequence is one that is
at least 45%, more preferably 60%, and most preferably 85% 
identical to any one of coding sequences SEQ ID Nos: 1 to 26. 

Consistent with the first aspect of the invention,
polypeptides having a sequence homologous to any one of SEQ ID
Nos: 27 to 45 include naturally-occurring allelic variants, as
well as mutants or any other non-naturally occurring variants
that retain the inherent characteristics of the polypeptide of
SEQ ID Nos: 27 to 45.

As is known in the art, an allelic variant is an
alternate form of a polypeptide that is characterized as having
a substitution, deletion, or addition of one or more amino acids
that does not alter the biological function of the polypeptide.
By "biological function" is meant the function of the
polypeptide in the cells in which it naturally occurs, even if
the function is not necessary for the growth or survival of the
cells. For example, the biological function of a porin is to
allow the entry into cells of compounds present in the
extracellular medium. Biological function is distinct from
antigenic property. A polypeptide can have more than one
biological function.

Allelic variants are very common in nature. For example,
a bacterial species such as C. pneumoniae is usually
represented by a variety of strains that differ from each other
by minor allelic variations. Indeed, a polypeptide that
fulfills the same biological function in different strains can
have an amino acid sequence (and polynucleotide sequence) that
is not identical in each of the strains. Despite this
variation, an immune response directed generally against many
allelic variants has been demonstrated. In studies of the
Chlamydial MOMP antigen, cross-strain antibody binding plus
neutralization of infectivity occurs despite amino acid sequence
variation of MOMP from strain to strain, indicating that the
MOMP, when used as an immunogen, is tolerant of amino acid
variations.

Polynucleotides encoding homologous polypeptides or
allelic variants are retrieved by polymerase chain reaction
(PCR) amplification of genomic bacterial DNA extracted by
conventional methods. This involves the use of synthetic
oligonucleotide primers matching upstream and downstream of the
5' and 3' ends of the encoding domain. Suitable primers are
designed according to the nucleotide sequence information
provided in SEQ ID Nos: 1 to 26. The procedure is as follows: a
primer is selected which consists of 10 to 40, preferably 15 to
25 nucleotides. It is advantageous to select primers containing
C and G nucleotides in a proportion sufficient to ensure
efficient hybridization; i.e., an amount of C and G nucleotides
of at least 40%, preferably 50% of the total nucleotide content.

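By way of illustration only, the primer guidelines stated above (a length of 10 to 40 nucleotides and a G+C content of at least 40%) can be checked with a short routine. The following is a minimal sketch and not part of the described procedure; the function names and the example 20-mer are hypothetical and are not taken from any of SEQ ID Nos: 1 to 26.

```python
def gc_fraction(primer: str) -> float:
    """Return the fraction of G and C bases in a primer (0.0 to 1.0)."""
    seq = primer.upper()
    if not seq:
        raise ValueError("primer is empty")
    return sum(base in "GC" for base in seq) / len(seq)


def primer_meets_guidelines(primer: str, min_len: int = 10, max_len: int = 40,
                            min_gc: float = 0.40) -> bool:
    """Check the length (10 to 40 nt) and G+C (at least 40%) guidelines."""
    return min_len <= len(primer) <= max_len and gc_fraction(primer) >= min_gc


# Hypothetical 20-mer used only to exercise the functions.
candidate = "ATGGCTAGCCGTACGTTAGC"
print(len(candidate), gc_fraction(candidate), primer_meets_guidelines(candidate))
```

The preferred ranges (15 to 25 nucleotides, 50% G+C) can be tested by passing different values for `min_len`, `max_len` and `min_gc`.
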
An alternative method for retrieving polynucleotides
encoding homologous polypeptides or allelic variants is by
hybridization screening of a DNA or RNA library. Hybridization
procedures are well-known in the art and are described in
Ausubel et al. (Ref 41), Silhavy et al. (Ref 43), and Davis et
al. (Ref 44). Important parameters for optimizing hybridization
conditions are reflected in a formula used to obtain the
critical melting temperature above which two complementary DNA
strands separate from each other (Ref 45). For polynucleotides
of about 600 nucleotides or larger, this formula is as follows:
Tm = 81.5 + 0.5 x (% G+C) + 1.6 log (positive ion concentration)
- 0.6 x (% formamide). Under appropriate stringency conditions,
hybridization temperature (Th) is approximately 20 to 40°C, 20
to 25°C, or, preferably 30 to 40°C below the calculated Tm.
Those skilled in the art will understand that optimal
temperature and salt conditions can be readily determined.
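
The Tm formula given above for polynucleotides of about 600 nucleotides or larger can be evaluated numerically. The sketch below uses the coefficients exactly as they appear in the text; it assumes a base-10 logarithm, a positive ion concentration expressed in mol/l, and percentages expressed on a 0 to 100 scale, none of which is stated explicitly in the text. The function names are hypothetical.

```python
import math


def tm_long(gc_percent: float, cation_molar: float,
            formamide_percent: float = 0.0) -> float:
    """Tm in deg C, using the coefficients as printed in the text."""
    if cation_molar <= 0:
        raise ValueError("cation concentration must be positive")
    return (81.5
            + 0.5 * gc_percent
            + 1.6 * math.log10(cation_molar)  # assumption: base-10 logarithm
            - 0.6 * formamide_percent)


def hybridization_window(tm: float, low_offset: float = 30.0,
                         high_offset: float = 40.0) -> tuple:
    """Return (lowest, highest) hybridization temperature below Tm."""
    return tm - high_offset, tm - low_offset


# Hypothetical inputs: 50% G+C, 1 M positive ions, 50% formamide.
tm = tm_long(50.0, 1.0, 50.0)
print(tm, hybridization_window(tm))
```

The default offsets correspond to the preferred range of 30 to 40°C below Tm; the other ranges mentioned in the text can be obtained by changing `low_offset` and `high_offset`.
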

For the polynucleotides of the invention, stringent
conditions are achieved for both pre-hybridizing and hybridizing
incubations (i) within 4-16 hours at 42°C, in 6 x SSC containing
50% formamide, or (ii) within 4-16 hours at 65°C in an aqueous
6 x SSC solution (1 M NaCl, 0.1 M sodium citrate (pH 7.0)).

Useful homologs and fragments thereof that do not occur
naturally are designed using known methods for identifying
regions of an antigen that are likely to tolerate amino acid
sequence changes and/or deletions. As an example, homologous
polypeptides from different species are compared; conserved
sequences are identified. The more divergent sequences are the
most likely to tolerate sequence changes. Alternatively,
sequences are modified such that they become more reactive to
T- and/or B-cells. (See Table below for identification of T- and
B-epitopes.) Yet another alternative is to mutate a particular
amino acid residue or sequence within the polypeptide in vitro,
then screen the mutant polypeptides for their ability to prevent
or treat Chlamydia infection according to the method outlined
below.

A person skilled in the art will readily understand that
by following the screening process of this invention, it will be
determined without undue experimentation whether a particular
homolog of any of SEQ ID Nos: 27 to 45 may be useful in the
prevention or treatment of Chlamydia infection. The screening
procedure comprises the steps:

(i) immunizing an animal, preferably a mouse, with the
test homolog or fragment;

(ii) inoculating the immunized animal with Chlamydia; and

(iii) selecting those homologs or fragments which confer
protection against Chlamydia.

By "conferring protection" is meant that there is a
reduction in severity of any of the effects of Chlamydia
infection, in comparison with a control animal which was not
immunized with the test homolog or fragment.
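
The selection step of the screening procedure above can be pictured as a comparison between immunized and control animals. The following sketch is illustrative only: the text does not prescribe a particular severity measure or statistic, so the use of a numeric severity score per animal and a comparison of group medians are assumptions, as are the function name and the example values.

```python
from statistics import median


def confers_protection(immunized_scores, control_scores) -> bool:
    """True if the median severity score is lower in immunized animals.

    Assumption: each score is a numeric measure of infection severity
    for one animal, where a higher value means a more severe infection.
    """
    if not immunized_scores or not control_scores:
        raise ValueError("both groups need at least one animal")
    return median(immunized_scores) < median(control_scores)


# Hypothetical scores, not experimental data.
print(confers_protection([2.1, 1.8, 2.5], [4.0, 3.7, 4.4]))
```

In practice a statistical test appropriate to the group size would be applied before a homolog or fragment is selected.
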

It has been previously demonstrated (Yang et al., 1993)
that mice are susceptible to intranasal infection with different
isolates of C. pneumoniae. Strain AR-39 (Grayston, 1989) was
used in Balb/c mice as a challenge infection model to examine
the capacity of chlamydia gene products delivered as naked DNA
to elicit a protective response against a sublethal C.
pneumoniae lung infection. Protective immunity is defined as an
accelerated clearance of pulmonary infection.

Groups of 7 to 9 week old male Balb/c mice (6 to 10 per
group) were immunized intramuscularly (i.m.) plus intranasally
(i.n.) with plasmid DNA containing the coding sequence of a
C. pneumoniae polypeptide. Saline or the plasmid vector lacking
an inserted chlamydial gene was given to groups of control
animals.

For i.m. immunization, alternate left and right
quadriceps were injected with 100 µg of DNA in 50 µl of PBS on
three occasions at 0, 3 and 6 weeks. For i.n. immunization,
anaesthetized mice aspirated 50 µl of PBS containing 50 µg DNA on
three occasions at 0, 3 and 6 weeks. At week 8, immunized mice
were inoculated i.n. with 5 x 10^5 IFU of C. pneumoniae, strain
AR39 in 100 µl of SPG buffer to test their ability to limit the
growth of a sublethal C. pneumoniae challenge.

Lungs were taken from mice at day 9 post-challenge and
immediately homogenised in SPG buffer (7.5% sucrose, 5 mM
glutamate, 12.5 mM phosphate pH 7.5). The homogenate was stored
frozen at -70°C until assay. Dilutions of the homogenate were
assayed for the presence of infectious chlamydia by inoculation
onto monolayers of susceptible cells. The inoculum was
centrifuged onto the cells at 3000 rpm for 1 hour, then the cells
were incubated for three days at 35°C in the presence of 1 µg/ml
cycloheximide. After incubation the monolayers were fixed with
formalin and methanol then immunoperoxidase stained for the
presence of chlamydial inclusions using convalescent sera from
rabbits infected with C. pneumoniae and metal-enhanced DAB as a
peroxidase substrate.

Consistent with the first aspect of the invention,
polypeptide derivatives are provided that are partial sequences
of SEQ ID Nos: 27 to 45, partial sequences of polypeptide
sequences homologous to SEQ ID Nos: 27 to 45, polypeptides
derived from full-length polypeptides by internal deletion, and
fusion proteins.

It is an accepted practice in the field of immunology to
use fragments and variants of protein immunogens as vaccines, as
all that is required to induce an immune response to a protein
is a small (e.g., 8 to 10 amino acid) immunogenic region of the
protein. Various short synthetic peptides corresponding to
surface-exposed antigens of pathogens other than Chlamydia have
been shown to be effective vaccine antigens against their
respective pathogens, e.g. an 11-residue peptide of murine
mammary tumor virus (Ref 38), a 16-residue peptide of Semliki
Forest virus (Ref 39), and two overlapping peptides of 15
residues each from canine parvovirus (Ref 40).

Accordingly, it will be readily apparent to one skilled
in the art, having read the present description, that partial
sequences of SEQ ID Nos: 27 to 45 or their homologous amino acid
sequences are inherent to the full-length sequences and are
taught by the present invention. Such polypeptide fragments
preferably are at least 12 amino acids in length.
Advantageously, they are at least 20 amino acids, preferably at
least 50 amino acids, more preferably at least 75 amino acids,
and most preferably at least 100 amino acids in length.
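
As an illustration of how candidate partial sequences of a given minimum length might be enumerated from a full-length sequence, the following minimal sketch lists overlapping fragments with a sliding window. The function name, the `step` parameter and the example string are hypothetical; the example is a placeholder and not one of SEQ ID Nos: 27 to 45.

```python
def peptide_fragments(sequence: str, length: int = 12, step: int = 1) -> list:
    """List overlapping fragments of a fixed length from a sequence.

    Returns an empty list when the sequence is shorter than `length`.
    """
    if length < 1 or step < 1:
        raise ValueError("length and step must be positive")
    last_start = len(sequence) - length
    return [sequence[i:i + length] for i in range(0, last_start + 1, step)]


# Placeholder 14-residue string used only to exercise the function.
for fragment in peptide_fragments("ABCDEFGHIJKLMN", length=12):
    print(fragment)
```

Longer preferred fragment lengths (20, 50, 75 or 100 amino acids) are obtained by changing `length`.
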

Polynucleotides of 30 to 600 nucleotides encoding partial
sequences of sequences homologous to SEQ ID Nos: 27 to 45 are
retrieved by PCR amplification using the parameters outlined
above and using primers matching the sequences upstream and
downstream of the 5' and 3' ends of the fragment to be
amplified. The template polynucleotide for such amplification
is either the full length polynucleotide homologous to one of
SEQ ID Nos: 1 to 26, or a polynucleotide contained in a mixture
of polynucleotides such as a DNA or RNA library. As an
alternative method for retrieving the partial sequences,
screening hybridization is carried out under conditions
described above and using the formula for calculating Tm. Where
fragments of 30 to 600 nucleotides are to be retrieved, the
calculated Tm is corrected by subtracting (600/polynucleotide
size in base pairs) and the stringency conditions are defined by
a hybridization temperature that is 5 to 10°C below Tm. Where
oligonucleotides shorter than 20-30 bases are to be obtained,
the formula for calculating the Tm is as follows: Tm = 4 x (G+C)
+ 2 x (A+T). For example, an 18 nucleotide fragment of 50% G+C
would have an approximate Tm of 54°C. Short peptides that are
fragments of SEQ ID Nos: 27 to 45 or their homologous
sequences, are obtained directly by chemical synthesis (E. Gross
and H. J. Meinhofer, 4 The Peptides: Analysis, Synthesis,
Biology; Modern Techniques of Peptide Synthesis, John Wiley &
Sons (1981), and M. Bodanzki, Principles of Peptide Synthesis,
Springer-Verlag (1984)).
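
The two calculations described in the preceding paragraph, namely the short-oligonucleotide rule Tm = 4 x (G+C) + 2 x (A+T) and the 600/size correction for fragments of 30 to 600 nucleotides, can be sketched as follows. The function names and the example sequence are hypothetical; the example is an arbitrary 18-mer of 50% G+C chosen to reproduce the worked value in the text.

```python
def tm_short_oligo(seq: str) -> int:
    """Tm in deg C from Tm = 4 x (G+C) + 2 x (A+T)."""
    s = seq.upper()
    gc = sum(base in "GC" for base in s)
    at = sum(base in "AT" for base in s)
    if gc + at != len(s):
        raise ValueError("sequence contains characters other than A, C, G, T")
    return 4 * gc + 2 * at


def tm_fragment_corrected(tm_calculated: float, size_bp: int) -> float:
    """Subtract 600/size (in base pairs) from a previously calculated Tm."""
    if size_bp <= 0:
        raise ValueError("size must be positive")
    return tm_calculated - 600.0 / size_bp


# Arbitrary 18-mer with nine G/C and nine A/T bases.
print(tm_short_oligo("GCGCGCGCGATATATATA"))  # 4*9 + 2*9 = 54
```

`tm_fragment_corrected` expects a Tm already obtained from the long-polynucleotide formula; the hybridization temperature is then taken 5 to 10°C below the corrected value.
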

Useful polypeptide derivatives, e.g., polypeptide
fragments, are designed using computer-assisted analysis of
amino acid sequences. This identifies probable
surface-exposed, antigenic regions (Ref 37). An analysis of the 13
amino acid sequences contained in SEQ ID Nos: 27 to 45, based on
the product of flexibility and hydrophobicity propensities using
the program SEQSEE (Wishart DS, et al. "SEQSEE: a comprehensive
program suite for protein sequence analysis." Comput Appl
Biosci. 1994 Apr; 10(2): 121-32), reveals a number of potential
B- and T-cell epitopes which may be used as a basis for selecting
useful immunogenic fragments and variants. The results are
shown in Figures 27 to 39. This analysis uses a reasonable
combination of external surface features that is likely to be
recognized by antibodies. Probable T-cell epitopes for
HLA-A0201 MHC subclass were revealed by an algorithm written at
Connaught Laboratories that emulates an approach developed at
the NIH (Parker KC, et al. "Peptide binding to MHC class I
molecules: implications for antigenic peptide prediction."
Immunol Res 1995; 14(1): 34-57).

Epitopes which induce a protective T cell-dependent
immune response are present throughout the length of the
polypeptide. However, some epitopes may be masked by secondary
and tertiary structures of the polypeptide. To reveal such
masked epitopes, large internal deletions are created which
remove much of the original protein structure and expose the
masked epitopes. Such internal deletions sometimes effect the
additional advantage of removing immunodominant regions of high
variability among strains. Polynucleotides encoding polypeptide
fragments and polypeptides having large internal deletions are
constructed using standard methods (Ref 41). Such methods
include standard PCR, inverse PCR, restriction enzyme treatment
of cloned DNA molecules, or the method of Kunkel et al. (Ref
42). Components for these methods and instructions for their
use are readily available from various commercial sources such
as Stratagene. Once the deletion mutants have been constructed,
they are tested for their ability to prevent or treat Chlamydia
infection as described above.

As used herein, a fusion polypeptide is one that contains
a polypeptide or a polypeptide derivative of the invention fused
at the N- or C-terminal end to any other polypeptide
(hereinafter referred to as a peptide tail). A simple way to
obtain such a fusion polypeptide is by translation of an
in-frame fusion of the polynucleotide sequences, i.e., a hybrid
gene. The hybrid gene encoding the fusion polypeptide is
inserted into an expression vector which is used to transform or
transfect a host cell. Alternatively, the polynucleotide
sequence encoding the polypeptide or polypeptide derivative is
inserted into an expression vector in which the polynucleotide
encoding the peptide tail is already present. Such vectors and
instructions for their use are commercially available, e.g.
the pMal-c2 or pMal-p2 system from New England Biolabs, in
which the peptide tail is a maltose binding protein, the
glutathione-S-transferase system of Pharmacia, or the His-Tag
system available from Novagen. These and other expression
systems provide convenient means for further purification of
polypeptides and derivatives of the invention.

An advantageous example of a fusion polypeptide is one
where the polypeptide or homolog or fragment of the invention is
fused to a polypeptide having adjuvant activity, such as subunit
B of either cholera toxin or E. coli heat-labile toxin. Another
advantageous fusion is one where the polypeptide, homolog or
fragment is fused to a strong T-cell epitope or B-cell epitope.
Such an epitope may be one known in the art (e.g. the Hepatitis
B virus core antigen, D.R. Millich et al., "Antibody production
to the nucleocapsid and envelope of the Hepatitis B virus primed
by a single synthetic T cell site", Nature. 1987. 329:547-549),
or one which has been identified in another polypeptide of the
invention (Table ). Consistent with this aspect of the
invention is a fusion polypeptide comprising T- or B-cell
epitopes from one of SEQ ID Nos: 27 to 45 or its homolog or
fragment, wherein the epitopes are derived from multiple
variants of said polypeptide or homolog or fragment, each
variant differing from another in the location and sequence of
its epitope within the polypeptide. Such a fusion is effective
in the prevention and treatment of Chlamydia infection since it
optimizes the T- and B-cell response to the overall polypeptide,
homolog or fragment.

To effect fusion, the polypeptide of the invention is
fused to the N-, or preferably, to the C-terminal end of the
polypeptide having adjuvant activity or T- or B-cell epitope.
Alternatively, a polypeptide fragment of the invention is
inserted internally within the amino acid sequence of the
polypeptide having adjuvant activity. The T- or B-cell epitope
may also be inserted internally within the amino acid sequence
of the polypeptide of the invention.

Consistent with the first aspect, the polynucleotides of
the invention also encode hybrid precursor polypeptides
containing heterologous signal peptides, which mature into
polypeptides of the invention. By "heterologous signal peptide"
is meant a signal peptide that is not found in
naturally-occurring precursors of polypeptides of the invention.

A polynucleotide molecule according to the invention,
including RNA, DNA, or modifications or combinations thereof,
has various applications. A DNA molecule is used, for example,
(i) in a process for producing the encoded polypeptide in a
recombinant host system, (ii) in the construction of vaccine
vectors such as poxviruses, which are further used in methods
and compositions for preventing and/or treating Chlamydia
infection, (iii) as a vaccine agent (as well as an RNA
molecule), in a naked form or formulated with a delivery vehicle
and, (iv) in the construction of attenuated Chlamydia strains
that can over-express a polynucleotide of the invention or
express it in a non-toxic, mutated form.

Accordingly, a second aspect of the invention encompasses
(i) an expression cassette containing a DNA molecule of the
invention placed under the control of the elements required for
expression, in particular under the control of an appropriate
promoter; (ii) an expression vector containing an expression
cassette of the invention; (iii) a procaryotic or eucaryotic
cell transformed or transfected with an expression cassette
and/or vector of the invention, as well as (iv) a process for
producing a polypeptide or polypeptide derivative encoded by a
polynucleotide of the invention, which involves culturing a
procaryotic or eucaryotic cell transformed or transfected with
an expression cassette and/or vector of the invention, under
conditions that allow expression of the DNA molecule of the
invention, and recovering the encoded polypeptide or polypeptide
derivative from the cell culture.

A recombinant expression system is selected from
procaryotic and eucaryotic hosts. Eucaryotic hosts include
yeast cells (e.g., Saccharomyces cerevisiae or Pichia pastoris),
mammalian cells (e.g., COS1, NIH3T3, or JEG3 cells), arthropod
cells (e.g., Spodoptera frugiperda (SF9) cells), and plant
cells. A preferred expression system is a procaryotic host such
as E. coli. Bacterial and eucaryotic cells are available from a
number of different sources including commercial sources to
those skilled in the art, e.g., the American Type Culture
Collection (ATCC; Rockville, Maryland). Commercial sources of
cells used for recombinant protein expression also provide
instructions for usage of the cells.

The choice of the expression system depends on the
features desired for the expressed polypeptide. For example, it
may be useful to produce a polypeptide of the invention in a
particular lipidated form or any other form.

One skilled in the art would readily understand that not
all vectors and expression control sequences and hosts would be
expected to express equally well the polynucleotides of this
invention. With the guidelines described below, however, a
selection of vectors, expression control sequences and hosts may
be made without undue experimentation and without departing from
the scope of this invention.

In selecting a vector, a host must be chosen that is
compatible with the vector which is to exist and possibly
replicate in it. Considerations are made with respect to the
vector copy number, the ability to control the copy number, and
the expression of other proteins such as antibiotic resistance.
In selecting an expression control sequence, a number of
variables are considered. Among the important variables are the
relative strength of the sequence (e.g. the ability to drive
expression under various conditions), the ability to control the
sequence's function, and compatibility between the
polynucleotide to be expressed and the control sequence (e.g.
secondary structures are considered to avoid hairpin structures
which prevent efficient transcription). In selecting the host,
unicellular hosts are selected which are compatible with the
selected vector, tolerant of any possible toxic effects of the
expressed product, able to secrete the expressed product
efficiently if such is desired, able to express the product in
the desired conformation, easily scaled up, and amenable to easy
purification of the final product.

The choice of the expression cassette depends on the
host system selected as well as the features desired for the
expressed polypeptide. Typically, an expression cassette
includes a promoter that is functional in the selected host
system and can be constitutive or inducible; a ribosome binding
site; a start codon (ATG) if necessary; a region encoding a
signal peptide, e.g., a lipidation signal peptide; a DNA
molecule of the invention; a stop codon; and optionally a 3'
terminal region (translation and/or transcription terminator).
The signal peptide encoding region is adjacent to the
polynucleotide of the invention and placed in proper reading
frame. The signal peptide-encoding region is homologous or
heterologous to the DNA molecule encoding the mature polypeptide
and is compatible with the secretion apparatus of the host used
for expression. The open reading frame constituted by the DNA
molecule of the invention, solely or together with the signal
peptide, is placed under the control of the promoter so that
transcription and translation occur in the host system.
Promoters and signal peptide encoding regions are widely known
and available to those skilled in the art and include, for
example, the promoter of Salmonella typhimurium (and
derivatives) that is inducible by arabinose (promoter araB) and
is functional in Gram-negative bacteria such as E. coli (as
described in U.S. Patent No. 5,028,530 and in Cagnon et al.,
(Ref 46)); the promoter of the gene of bacteriophage T7 encoding
RNA polymerase, that is functional in a number of E. coli
strains expressing T7 polymerase (described in U.S. Patent
No. 4,952,496); OspA lipidation signal peptide; and RlpB
lipidation signal peptide (Ref 47).
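
The order of elements in the expression cassette described above, and the requirement that the signal peptide encoding region be in proper reading frame with the DNA molecule of the invention, can be sketched as a small data structure. This is an illustrative model only: the class name, field names and all example sequences are hypothetical placeholders, sequences are assumed to be upper-case DNA strings, and the default stop codon is an arbitrary choice.

```python
from dataclasses import dataclass

STOP_CODONS = {"TAA", "TAG", "TGA"}


@dataclass
class ExpressionCassette:
    promoter: str
    rbs: str                  # ribosome binding site
    coding: str               # DNA molecule, without start and stop codons
    signal_peptide: str = ""  # optional signal peptide encoding region
    start_codon: str = "ATG"
    stop_codon: str = "TAA"   # assumption: any stop codon could be used
    terminator: str = ""      # optional 3' terminal region

    def open_reading_frame(self) -> str:
        """Join start codon, signal peptide, coding region and stop codon."""
        orf = self.start_codon + self.signal_peptide + self.coding + self.stop_codon
        if len(orf) % 3 != 0:
            raise ValueError("signal peptide and coding region are not in frame")
        codons = [orf[i:i + 3] for i in range(0, len(orf), 3)]
        if any(codon in STOP_CODONS for codon in codons[:-1]):
            raise ValueError("premature stop codon in the open reading frame")
        return orf

    def assemble(self) -> str:
        """Return the cassette in 5' to 3' order."""
        return self.promoter + self.rbs + self.open_reading_frame() + self.terminator


# Placeholder sequences, far shorter than any real element.
cassette = ExpressionCassette(promoter="TTGACA", rbs="AGGAGG",
                              signal_peptide="AAAAAA", coding="GCTGCT")
print(cassette.assemble())
```

A frame shift introduced by a signal peptide or coding region whose combined length is not a multiple of three is reported as an error rather than silently assembled.
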

The expression cassette is typically part of an
expression vector, which is selected for its ability to
replicate in the chosen expression system. Expression vectors
(e.g., plasmids or viral vectors) can be chosen, for example,
from those described in Pouwels et al. (Cloning Vectors: A
Laboratory Manual 1985, Supp. 1987). Suitable expression
vectors can be purchased from various commercial sources.






WO 00/24765 PCT/CA99/00992 


Methods for transforming/transfecting host cells with
expression vectors are well-known in the art and depend on the
host system selected as described in Ausubel et al. (Ref 41).

Upon expression, a recombinant polypeptide of the
invention (or a polypeptide derivative) is produced and remains
in the intracellular compartment, is secreted/excreted in the
extracellular medium or in the periplasmic space, or is embedded
in the cellular membrane. The polypeptide is recovered in a
substantially purified form from the cell extract or from the
supernatant after centrifugation of the recombinant cell
culture. Typically, the recombinant polypeptide is purified by
antibody-based affinity purification or by other well-known
methods that can be readily adapted by a person skilled in the
art, such as fusion of the polynucleotide encoding the
polypeptide or its derivative to a small affinity binding
domain. Antibodies useful for purifying by immunoaffinity the
polypeptides of the invention are obtained as described below.

A polynucleotide of the invention can also be useful as a
vaccine. There are two major routes, either using a viral or
bacterial host as gene delivery vehicle (live vaccine vector) or
administering the gene in a free form, e.g., inserted into a
plasmid. Therapeutic or prophylactic efficacy of a
polynucleotide of the invention is evaluated as described below.

Accordingly, a third aspect of the invention provides (i)
a vaccine vector such as a poxvirus, containing a DNA molecule
of the invention, placed under the control of elements required
for expression; (ii) a composition of matter comprising a
vaccine vector of the invention, together with a diluent or
carrier; specifically (iii) a pharmaceutical composition
containing a therapeutically or prophylactically effective
amount of a vaccine vector of the invention; (iv) a method for
inducing an immune response against Chlamydia in a mammal (e.g.,
a human; alternatively, the method can be used in veterinary
applications for treating or preventing Chlamydia infection of
animals, e.g., cats or birds), which involves administering to
the mammal an immunogenically effective amount of a vaccine
vector of the invention to elicit a protective or therapeutic
immune response to Chlamydia; and particularly, (v) a method
for preventing and/or treating a Chlamydia (e.g., C.
trachomatis, C. psittaci, C. pneumoniae, C. pecorum) infection,
which involves administering a prophylactic or therapeutic
amount of a vaccine vector of the invention to an infected
individual. Additionally, the third aspect of the invention
encompasses the use of a vaccine vector of the invention in the
preparation of a medicament for preventing and/or treating
Chlamydia infection.

As used herein, a vaccine vector expresses one or several
polypeptides or derivatives of the invention, as well as at
least one additional Chlamydia antigen (??), fragment, homolog,
mutant, or derivative thereof. The vaccine vector may express
additionally a cytokine, such as interleukin-2 (IL-2) or
interleukin-12 (IL-12), that enhances the immune response
(adjuvant effect). It is understood that each of the components
to be expressed is placed under the control of elements required
for expression in a mammalian cell.

Consistent with the third aspect of the invention is a
composition comprising several vaccine vectors, each of them
capable of expressing a polypeptide or derivative of the
invention. A composition may also comprise a vaccine vector
capable of expressing an additional Chlamydia antigen, or a
subunit, fragment, homolog, mutant, or derivative thereof; or a
cytokine such as IL-2 or IL-12.

Vaccination methods for treating or preventing infection
in a mammal comprise use of a vaccine vector of the invention
to be administered by any conventional route, particularly to a
mucosal (e.g., ocular, intranasal, oral, gastric, pulmonary,
intestinal, rectal, vaginal, or urinary tract) surface or via
the parenteral (e.g., subcutaneous, intradermal, intramuscular,
intravenous, or intraperitoneal) route. Preferred routes depend
upon the choice of the vaccine vector. Treatment may be
effected in a single dose or repeated at intervals. The
appropriate dosage depends on various parameters understood by
skilled artisans such as the vaccine vector itself, the route of
administration or the condition of the mammal to be vaccinated
(weight, age and the like).

Live vaccine vectors available in the art include viral
vectors such as adenoviruses and poxviruses as well as bacterial
vectors, e.g., Shigella, Salmonella, Vibrio cholerae,
Lactobacillus, Bacille bilie de Calmette-Guerin (BCG), and
Streptococcus.

An example of an adenovirus vector, as well as a method
for constructing an adenovirus vector capable of expressing a
DNA molecule of the invention, are described in U.S. Patent No.
4,920,209. Poxvirus vectors include vaccinia and canary pox
virus, described in U.S. Patent No. 4,722,848 and U.S. Patent
No. 5,364,773, respectively. (Also see, e.g., Tartaglia et al.,
Virology (1992) 188:217 for a description of a vaccinia virus
vector and Taylor et al., Vaccine (1995) 13:539 for a reference
of a canary pox.) Poxvirus vectors capable of expressing a
polynucleotide of the invention are obtained by homologous
recombination as described in Kieny et al., Nature (1984)
312:163 so that the polynucleotide of the invention is inserted
in the viral genome under appropriate conditions for expression
in mammalian cells. Generally, the dose of vaccine viral
vector, for therapeutic or prophylactic use, can be of from
about 1x10^4 to about 1x10^11, advantageously from about 1x10^7 to
about 1x10^10, preferably of from about 1x10^7 to about 1x10^9
plaque-forming units per kilogram. Preferably, viral vectors
are administered parenterally; for example, in 3 doses, 4 weeks
apart. It is preferable to avoid adding a chemical adjuvant to
a composition containing a viral vector of the invention and
thereby minimize the immune response to the viral vector
itself.

Non-toxicogenic Vibrio cholerae mutant strains that are
useful as a live oral vaccine are known. Mekalanos et al.,
Nature (1983) 306:551 and U.S. Patent No. 4,882,278 describe
strains which have a substantial amount of the coding sequence
of each of the two ctxA alleles deleted so that no functional
cholera toxin is produced. WO 92/11354 describes a strain in
which the irgA locus is inactivated by mutation; this mutation
can be combined in a single strain with ctxA mutations. WO
94/1533 describes a deletion mutant lacking functional ctxA and
attRS1 DNA sequences. These mutant strains are genetically
engineered to express heterologous antigens, as described in
WO 94/19482. An effective vaccine dose of a Vibrio cholerae
strain capable of expressing a polypeptide or polypeptide
derivative encoded by a DNA molecule of the invention contains
about 1x10 s to about 1x10 s , preferably about 1x10 s to about 1x10 s , 
viable bacteria in a volume appropriate for the selected route
of administration. Preferred routes of administration include
all mucosal routes; most preferably, these vectors are
administered intranasally or orally.

Attenuated Salmonella typhimurium strains, genetically
engineered for recombinant expression of heterologous antigens
or not, and their use as oral vaccines are described in
Nakayama et al. (Bio/Technology (1988) 6:693) and WO 92/11361.
Preferred routes of administration include all mucosal routes;
most preferably, these vectors are administered intranasally or
orally.

Other bacterial strains used as vaccine vectors in the
context of the present invention are described in High et al.,
EMBO (1992) 11:1991 and Sizemore et al., Science (1995) 270:299
(Shigella flexneri); Medaglini et al., Proc. Natl. Acad. Sci.
USA (1995) 92:6868 (Streptococcus gordonii), Flynn J.L., Cell.
Mol. Biol. (1994) 40 (suppl. I):31, WO 88/6626, WO 90/0594, WO
91/13157, WO 92/1796, and WO 92/21376 (Bacille Calmette Guerin).

In bacterial vectors, the polynucleotide of the invention
is inserted into the bacterial genome or remains in a free
state as part of a plasmid.

The composition comprising a vaccine bacterial vector of
the present invention may further contain an adjuvant. A
number of adjuvants are known to those skilled in the art.
Preferred adjuvants are selected from the list provided below.

Accordingly, a fourth aspect of the invention provides
(i) a composition of matter comprising a polynucleotide of the
invention, together with a diluent or carrier; (ii) a
pharmaceutical composition comprising a therapeutically or
prophylactically effective amount of a polynucleotide of the
invention; (iii) a method for inducing an immune response
against Chlamydia in a mammal by administration of an
immunogenically effective amount of a polynucleotide of the
invention to elicit a protective immune response to Chlamydia;
and particularly, (iv) a method for preventing and/or treating a
Chlamydia (e.g., C. trachomatis, C. psittaci, C. pneumoniae, or
C. pecorum) infection, by administering a prophylactic or
therapeutic amount of a polynucleotide of the invention to an
infected individual. Additionally, the fourth aspect of the
invention encompasses the use of a polynucleotide of the
invention in the preparation of a medicament for preventing
and/or treating Chlamydia infection. A preferred use includes
the use of a DNA molecule placed under conditions for expression
in a mammalian cell, especially in a plasmid that is unable to
replicate in mammalian cells and to substantially integrate in a
mammalian genome.

Use of the polynucleotides of the invention includes their
administration to a mammal as a vaccine, for therapeutic or
prophylactic purposes. Such polynucleotides are used in the
form of DNA as part of a plasmid that is unable to replicate in
a mammalian cell and unable to integrate into the mammalian
genome. Typically, such a DNA molecule is placed under the
control of a promoter suitable for expression in a mammalian
cell. The promoter functions either ubiquitously or
tissue-specifically. Examples of non-tissue specific promoters include
the early Cytomegalovirus (CMV) promoter (described in U.S.
Patent No. 4,168,062) and the Rous Sarcoma Virus promoter
(described in Norton & Coffin, Molec. Cell Biol. (1985) 5:281).
An example of a tissue-specific promoter is the desmin promoter
which drives expression in muscle cells (Li et al., Gene (1989)
78:243, Li & Paulin, J. Biol. Chem. (1991) 266:6562 and Li &
Paulin, J. Biol. Chem. (1993) 268:10403). Use of promoters is
well-known to those skilled in the art. Useful vectors are
described in numerous publications, specifically WO 94/21797 and
Hartikka et al., Human Gene Therapy (1996) 7:1205.

Polynucleotides of the invention which are used as a
vaccine encode either a precursor or a mature form of the
corresponding polypeptide. In the precursor form, the signal
peptide is either homologous or heterologous. In the latter
case, a eucaryotic leader sequence such as the leader sequence
of the tissue-type plasminogen activator (tPA) is preferred.

As used herein, a composition of the invention contains
one or several polynucleotides with optionally at least one
additional polynucleotide encoding another Chlamydia antigen
such as urease subunit A, B, or both, or a fragment, derivative,
mutant, or analog thereof. The composition may also contain an
additional polynucleotide encoding a cytokine, such as
interleukin-2 (IL-2) or interleukin-12 (IL-12) so that the
immune response is enhanced. These additional polynucleotides
are placed under appropriate control for expression.
Advantageously, DNA molecules of the invention and/or additional
DNA molecules to be included in the same composition, are
present in the same plasmid.

Standard techniques of molecular biology for preparing
and purifying polynucleotides are used in the preparation of
polynucleotide therapeutics of the invention. For use as a
vaccine, a polynucleotide of the invention is formulated
according to various methods outlined below.

One method utilizes the polynucleotide in a naked
form, free of any delivery vehicles. Such a polynucleotide is
simply diluted in a physiologically acceptable solution such as
sterile saline or sterile buffered saline, with or without a
carrier. When present, the carrier preferably is isotonic,
hypotonic, or weakly hypertonic, and has a relatively low ionic
strength, such as provided by a sucrose solution, e.g., a
solution containing 20% sucrose.

An alternative method utilizes the polynucleotide in
association with agents that assist in cellular uptake.
Examples of such agents are (i) chemicals that modify cellular
permeability, such as bupivacaine (see, e.g., WO 94/16737), (ii)
liposomes for encapsulation of the polynucleotide, or
(iii) cationic lipids or silica, gold, or tungsten
microparticles which associate themselves with the
polynucleotides.

Anionic and neutral liposomes are well-known in the art
(see, e.g., Liposomes: A Practical Approach, R.R.C. New, Ed., IRL
Press (1990), for a detailed description of methods for making
liposomes) and are useful for delivering a large range of
products, including polynucleotides. Cationic lipids are also
known in the art and are commonly used for gene delivery. Such
lipids include Lipofectin™ also known as DOTMA (N-[1-(2,3-
dioleyloxy)propyl]-N,N,N-trimethylammonium chloride), DOTAP
(1,2-bis(oleyloxy)-3-(trimethylammonio)propane), DDAB
(dimethyldioctadecylammonium bromide), DOGS
(dioctadecylamidoglycyl spermine) and cholesterol derivatives
such as DC-Chol (3 beta-(N-(N',N'-dimethyl aminomethane)-
carbamoyl) cholesterol). A description of these cationic lipids
can be found in EP 187,702, WO 90/11092, U.S. Patent
No. 5,283,185, WO 91/15501, WO 95/26356, and U.S. Patent
No. 5,527,928. Cationic lipids for gene delivery are preferably
used in association with a neutral lipid such as DOPE (dioleyl
phosphatidylethanolamine), as described in WO 90/11092 as an
example.

Formulations containing cationic liposomes may optionally
contain other transfection-facilitating compounds. A number of
them are described in WO 93/18759, WO 93/19768, WO 94/25608, and
WO 95/2397. They include spermine derivatives useful for
facilitating the transport of DNA through the nuclear membrane
(see, for example, WO 93/18759) and membrane-permeabilizing
compounds such as GALA, Gramicidin S, and cationic bile salts
(see, for example, WO 93/19768).

Gold or tungsten microparticles are used for gene
delivery, as described in WO 91/359, WO 93/17706, and Tang et
al. (Nature (1992) 356:152). The microparticle-coated
polynucleotide is injected via intradermal or intraepidermal
routes using a needleless injection device ("gene gun"), such as
those described in U.S. Patent No. 4,945,050, U.S. Patent
No. 5,015,580, and WO 94/24263.

The amount of DNA to be used in a vaccine recipient
depends, e.g., on the strength of the promoter used in the DNA
construct, the immunogenicity of the expressed gene product, the
condition of the mammal intended for administration (e.g., the
weight, age, and general health of the mammal), the mode of
administration, and the type of formulation. In general, a
therapeutically or prophylactically effective dose from about 1
ug to about 1 mg, preferably, from about 10 ug to about 800 ug
and, more preferably, from about 25 ug to about 250 ug, can be
administered to human adults. The administration can be
achieved in a single dose or repeated at intervals.
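
The nested dose ranges above can be expressed as a small lookup.
The following is a minimal sketch only: the function and tier
names are hypothetical, and the "about" limits of the text are
treated as hard bounds purely for illustration.

```python
from typing import Optional

# Dose ranges for human adults, in micrograms of DNA, taken from the
# paragraph above. Treating the "about" limits as exact bounds is an
# assumption made for this sketch.
DOSE_TIERS_UG = [
    ("more preferred", 25.0, 250.0),
    ("preferred", 10.0, 800.0),
    ("general", 1.0, 1000.0),  # 1 mg = 1000 ug
]


def dose_tier(dose_ug: float) -> Optional[str]:
    """Return the narrowest stated range containing the dose, or None."""
    for name, low, high in DOSE_TIERS_UG:
        if low <= dose_ug <= high:
            return name
    return None
```

For instance, dose_tier(100.0) falls in the narrowest range, while
a dose above 1 mg falls outside every stated range.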

The route of administration is any conventional route
used in the vaccine field. As general guidance, a
polynucleotide of the invention is administered via a mucosal
surface, e.g., an ocular, intranasal, pulmonary, oral,
intestinal, rectal, vaginal, or urinary tract surface; or via a
parenteral route, e.g., by an intravenous, subcutaneous,
intraperitoneal, intradermal, intraepidermal, or intramuscular
route. The choice of administration route depends on the
formulation that is selected. A polynucleotide formulated in
association with bupivacaine is advantageously administered into
muscles. When a neutral or anionic liposome or a cationic
lipid, such as DOTMA or DC-Chol, is used, the formulation can be
advantageously injected via intravenous, intranasal
(aerosolization), intramuscular, intradermal, and subcutaneous
routes. A polynucleotide in a naked form can advantageously be
administered via the intramuscular, intradermal, or
subcutaneous routes.

Although not absolutely required, such a composition can
also contain an adjuvant. If so, a systemic adjuvant that does
not require concomitant administration in order to exhibit an
adjuvant effect is preferable, e.g., QS21, which is
described in U.S. Patent No. 5,057,546.

The sequence information provided in the present
application enables the design of specific nucleotide probes and
primers that are used for diagnostic purposes. Accordingly, a
fifth aspect of the invention provides a nucleotide probe or
primer having a sequence found in or derived by degeneracy of
the genetic code from a sequence shown in any one of SEQ ID
Nos: 1 to 26.

The term "probe" as used in the present application
refers to DNA (preferably single stranded) or RNA molecules (or
modifications or combinations thereof) that hybridize under the
stringent conditions, as defined above, to nucleic acid
molecules having SEQ ID Nos: 1 to 26 or to sequences homologous
to SEQ ID Nos: 1 to 26, or to their complementary or anti-sense
sequences. Generally, probes are significantly shorter than
full-length sequences. Such probes contain from about 5 to
about 100, preferably from about 10 to about 80, nucleotides.
In particular, probes have sequences that are at least 75%,
preferably at least 85%, more preferably 95% homologous to a
portion of any of SEQ ID Nos: 1 to 26 or that are complementary
to such sequences. Probes may contain modified bases such as
inosine, methyl-5-deoxycytidine, deoxyuridine, dimethylamino-5-
deoxyuridine, or diamino-2,6-purine. Sugar or phosphate
residues may also be modified or substituted. For example, a
deoxyribose residue may be replaced by a polyamide (Nielsen et
al., Science (1991) 254:1497) and phosphate residues may be
replaced by ester groups such as diphosphate, alkyl,
arylphosphonate and phosphorothioate esters. In addition, the
2'-hydroxyl group on ribonucleotides may be modified by
including such groups as alkyl groups.
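
The length and homology criteria for probes stated above can be
screened computationally. The sketch below is illustrative only:
it assumes that homology is scored as ungapped percent identity
against the best-matching window of the target, which is one
possible reading and not necessarily the definition used
elsewhere in this application. The function names and the
example sequences in the usage note are hypothetical.

```python
def percent_identity(probe: str, window: str) -> float:
    """Percent of positions at which two equal-length sequences match."""
    if len(probe) != len(window):
        raise ValueError("sequences must have equal length")
    if not probe:
        raise ValueError("sequences must be non-empty")
    matches = sum(a == b for a, b in zip(probe.upper(), window.upper()))
    return 100.0 * matches / len(probe)


def best_identity(probe: str, target: str) -> float:
    """Best ungapped percent identity of the probe over all target windows."""
    n = len(probe)
    if n == 0 or n > len(target):
        raise ValueError("probe must be non-empty and no longer than target")
    return max(
        percent_identity(probe, target[i:i + n])
        for i in range(len(target) - n + 1)
    )


def meets_probe_criteria(probe: str, target: str,
                         min_identity: float = 75.0,
                         min_len: int = 5, max_len: int = 100) -> bool:
    """Check the stated length range (about 5 to about 100 nucleotides)
    and the minimum identity (75% by default) to a portion of the target."""
    if not (min_len <= len(probe) <= max_len):
        return False
    return best_identity(probe, target) >= min_identity
```

With a made-up target such as "ATGCGTACGTTAGC", the candidate
"CGTACGAA" matches the window "CGTACGTT" at 6 of 8 positions and
so just meets the 75% threshold, whereas a 4-nucleotide candidate
is rejected on length alone.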

Probes of the invention are used in diagnostic tests, as
capture or detection probes. Such capture probes are
conventionally immobilized on a solid support, directly or
indirectly, by covalent means or by passive adsorption. A
detection probe is labelled by a detection marker selected from:
radioactive isotopes, enzymes such as peroxidase, alkaline
phosphatase, and enzymes able to hydrolyze a chromogenic,
fluorogenic, or luminescent substrate, compounds that are
chromogenic, fluorogenic, or luminescent, nucleotide base
analogs, and biotin.

Probes of the invention are used in any conventional
hybridization technique, such as dot blot (Maniatis et al.,
Molecular Cloning: A Laboratory Manual (1982) Cold Spring Harbor
Laboratory Press, Cold Spring Harbor, New York), Southern blot
(Southern, J. Mol. Biol. (1975) 98:503), northern blot
(identical to Southern blot with the exception that RNA is used
as a target), or the sandwich technique (Dunn et al., Cell
(1977) 12:23). The latter technique involves the use of a
specific capture probe and/or a specific detection probe with
nucleotide sequences that at least partially differ from each
other.

A primer is a probe of usually about 10 to about
40 nucleotides that is used to initiate enzymatic polymerization
of DNA in an amplification process (e.g., PCR), in an elongation
process, or in a reverse transcription method. Primers used in
diagnostic methods involving PCR are labeled by methods known in
the art.
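
As an illustration of the primer definition above, the following
sketch checks the stated length range and derives the reverse
complement of a sequence, which is how a primer annealing to the
opposite strand is conventionally written. The function names are
hypothetical, and the helper handles only unmodified A, C, G, and
T bases, which is an assumption of this sketch.

```python
# Translation table for the four unmodified DNA bases (both cases).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def is_primer_length(seq: str, min_len: int = 10, max_len: int = 40) -> bool:
    """Check the 'about 10 to about 40 nucleotides' range given above,
    treating the limits as exact bounds for illustration."""
    return min_len <= len(seq) <= max_len
```

A reverse primer for a PCR can then be sketched as the reverse
complement of the last 10 to 40 nucleotides of the region to be
amplified.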

As described herein, the invention also encompasses (i) a
reagent comprising a probe of the invention for detecting and/or
identifying the presence of Chlamydia in a biological material;
(ii) a method for detecting and/or identifying the presence of
Chlamydia in a biological material, in which (a) a sample is
recovered or derived from the biological material, (b) DNA or
RNA is extracted from the material and denatured, and (c) the
extracted nucleic acid is exposed to a probe of the invention,
for example, a capture probe, a detection probe, or both, under
stringent hybridization conditions, such that hybridization is
detected; and (iii) a method for detecting and/or identifying
the presence of Chlamydia in a biological material, in which
(a) a sample is recovered or derived from the biological
material, (b) DNA is extracted therefrom, (c) the extracted DNA
is primed with at least one, and preferably two, primers of the
invention and amplified by polymerase chain reaction, and (d)
the amplified DNA fragment is produced.

It is apparent that disclosure of the polynucleotide
sequences of SEQ ID Nos: 1 to 26, their homologs, and partial
sequences of either enables their corresponding amino acid
sequences. Accordingly, a sixth aspect of the invention
features a substantially purified polypeptide or polypeptide
derivative having an amino acid sequence encoded by a
polynucleotide of the invention.

A "substantially purified polypeptide" as used herein is
defined as a polypeptide that is separated from the environment
in which it naturally occurs and/or that is free of the majority
of the polypeptides that are present in the environment in which
it was synthesized. For example, a substantially purified
polypeptide is free from cytoplasmic polypeptides. Those
skilled in the art would readily understand that the
polypeptides of the invention may be purified from a natural
source, i.e., a Chlamydia strain, or produced by recombinant
means.

Consistent with the sixth aspect of the invention are
polypeptides, homologs or fragments which are modified or
treated to enhance their immunogenicity in the target animal, in
whom the polypeptide, homolog or fragments are intended to
confer protection against Chlamydia. Such modifications or
treatments include: amino acid substitutions with an amino acid
derivative such as 3-methylhistidine, 4-hydroxyproline,
5-hydroxylysine, etc.; and modifications or deletions which are
carried out after preparation of the polypeptide, homolog or
fragment, such as the modification of free amino, carboxyl or
hydroxyl side groups of the amino acids.

Identification of homologous polypeptides or polypeptide
derivatives encoded by polynucleotides of the invention which
have specific antigenicity is achieved by screening for cross-
reactivity with an antiserum raised against the polypeptide of
reference having an amino acid sequence of any one of SEQ ID
Nos: 27 to 45. The procedure is as follows: a monospecific
hyperimmune antiserum is raised against a purified reference
polypeptide, a fusion polypeptide (for example, an expression
product of MBP, GST, or His-tag systems), or a synthetic peptide
predicted to be antigenic. Where an antiserum is raised
against a fusion polypeptide, two different fusion systems are
employed. Specific antigenicity can be determined according to
a number of methods, including Western blot (Towbin et al.,
Proc. Natl. Acad. Sci. USA (1979) 76:4350), dot blot, and ELISA,
as described below.

In a Western blot assay, the product to be screened,
either as a purified preparation or a total E. coli extract, is
subjected to SDS-PAGE electrophoresis as described by Laemmli
(Nature (1970) 227:680). After transfer to a nitrocellulose
membrane, the material is further incubated with the
monospecific hyperimmune antiserum diluted in the range of
dilutions from about 1:5 to about 1:5000, preferably from about
1:100 to about 1:500. Specific antigenicity is shown once a
band corresponding to the product exhibits reactivity at any of
the dilutions in the above range.

In an ELISA assay, the product to be screened is
preferably used as the coating antigen. A purified preparation
is preferred, although a whole cell extract can also be used.
Briefly, about 100 ul of a preparation at about 10 ug protein/ml
are distributed into wells of a 96-well polycarbonate ELISA
plate. The plate is incubated for 2 hours at 37°C then
overnight at 4°C. The plate is washed with phosphate-buffered
saline (PBS) containing 0.05% Tween 20 (PBS/Tween buffer). The
wells are saturated with 250 ul PBS containing 1% bovine serum
albumin (BSA) to prevent non-specific antibody binding. After 1
hour incubation at 37°C, the plate is washed with PBS/Tween
buffer. The antiserum is serially diluted in PBS/Tween buffer
containing 0.5% BSA. 100 ul of dilutions are added per well.
The plate is incubated for 90 minutes at 37°C, washed and
evaluated according to standard procedures. For example, a goat
anti-rabbit peroxidase conjugate is added to the wells when
specific antibodies were raised in rabbits. Incubation is
carried out for 90 minutes at 37°C and the plate is washed. The
reaction is developed with the appropriate substrate and the
reaction is measured by colorimetry (absorbance measured
spectrophotometrically). Under the above experimental
conditions, a positive reaction is shown by O.D. values greater
than a non-immune control serum.

In a dot blot assay, a purified product is preferred,
although a whole cell extract can also be used. Briefly, a
solution of the product at about 100 ug/ml is serially two-fold
diluted in 50 mM Tris-HCl (pH 7.5). 100 ul of each dilution are
applied to a 0.45 um nitrocellulose membrane set in a 96-well
dot blot apparatus (Biorad). The buffer is removed by applying
vacuum to the system. Wells are washed by addition of 50 mM
Tris-HCl (pH 7.5) and the membrane is air-dried. The membrane
is saturated in blocking buffer (50 mM Tris-HCl (pH 7.5), 0.15 M
NaCl, 10 g/L skim milk) and incubated with an antiserum dilution
from about 1:50 to about 1:5000, preferably about 1:500. The
reaction is revealed according to standard procedures. For
example, a goat anti-rabbit peroxidase conjugate is added to the
wells when rabbit antibodies are used. Incubation is carried
out for 90 minutes at 37°C and the blot is washed. The reaction
is developed with the appropriate substrate and stopped. The
reaction is measured visually by the appearance of a colored
spot, e.g., by colorimetry. Under the above experimental
conditions, a positive reaction is shown once a colored spot is
associated with a dilution of at least about 1:5, preferably of
at least about 1:500.
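
The quantities in the ELISA and dot blot procedures above reduce
to simple arithmetic, sketched below for illustration. The
function names are hypothetical, and the O.D. readings used in
the usage note are made-up values, not data from this
application.

```python
from typing import Dict, List


def protein_amount_ug(volume_ul: float, conc_ug_per_ml: float) -> float:
    """Protein mass (ug) in a given volume (ul) at a given concentration."""
    return volume_ul * conc_ug_per_ml / 1000.0


def twofold_series(start_ug_per_ml: float, steps: int) -> List[float]:
    """Concentrations of a serial two-fold dilution, starting undiluted."""
    return [start_ug_per_ml / (2 ** i) for i in range(steps)]


def positive_dilutions(od_by_dilution: Dict[int, float],
                       od_control: float) -> List[int]:
    """Reciprocal antiserum dilutions whose O.D. exceeds the non-immune
    control, following the positivity criterion stated for the ELISA."""
    return sorted(d for d, od in od_by_dilution.items() if od > od_control)
```

Under these definitions, 100 ul of a 10 ug/ml preparation coats
each ELISA well with 1 ug of protein, and the first dot of the
dot blot series (100 ul at 100 ug/ml) carries 10 ug. With
hypothetical readings of 1.2, 0.6, and 0.15 at dilutions 1:100,
1:400, and 1:1600 and a control O.D. of 0.2, the first two
dilutions are scored positive.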

Therapeutic or prophylactic efficacy of a polypeptide or
derivative of the invention can be evaluated as described below.
A seventh aspect of the invention provides (i) a composition of
matter comprising a polypeptide of the invention together with a
diluent or carrier; specifically (ii) a pharmaceutical
composition containing a therapeutically or prophylactically
effective amount of a polypeptide of the invention; (iii) a
method for inducing an immune response against Chlamydia in a
mammal, by administering to the mammal an immunogenically
effective amount of a polypeptide of the invention to elicit a
protective immune response to Chlamydia; and particularly, (iv)
a method for preventing and/or treating a Chlamydia (e.g., C.
trachomatis, C. psittaci, C. pneumoniae, or C. pecorum)
infection, by administering a prophylactic or therapeutic amount
of a polypeptide of the invention to an infected individual.
Additionally, the seventh aspect of the invention encompasses
the use of a polypeptide of the invention in the preparation of
a medicament for preventing and/or treating Chlamydia infection.

As used herein, the immunogenic compositions of the
invention are administered by conventional routes known in the
vaccine field, in particular to a mucosal (e.g., ocular,
intranasal, pulmonary, oral, gastric, intestinal, rectal,
vaginal, or urinary tract) surface or via the parenteral (e.g.,
subcutaneous, intradermal, intramuscular, intravenous, or
intraperitoneal) route. The choice of administration route
depends upon a number of parameters, such as the adjuvant
associated with the polypeptide. If a mucosal adjuvant is used,
the intranasal or oral route is preferred. If a lipid
formulation or an aluminum compound is used, the parenteral
route is preferred with the subcutaneous or intramuscular route
being most preferred. The choice also depends upon the nature
of the vaccine agent. For example, a polypeptide of the
invention fused to CTB or LTB is best administered to a mucosal
surface.

As used herein, the composition of the invention contains
one or several polypeptides or derivatives of the invention.
The composition optionally contains at least one additional
Chlamydia antigen, or a subunit, fragment, homolog, mutant, or
derivative thereof.

For use in a composition of the invention, a polypeptide
or derivative thereof is formulated into or with liposomes,
preferably neutral or anionic liposomes, microspheres, ISCOMS,
or virus-like-particles (VLPs) to facilitate delivery and/or
enhance the immune response. These compounds are readily
available to one skilled in the art; for example, see Liposomes:
A Practical Approach (supra).

Adjuvants other than liposomes and the like are also used
and are known in the art. Adjuvants may protect the antigen
from rapid dispersal by sequestering it in a local deposit, or
they may contain substances that stimulate the host to secrete
factors that are chemotactic for macrophages and other
components of the immune system. An appropriate selection can
conventionally be made by those skilled in the art, for example,
from those described below.

Treatment is achieved in a single dose or repeated as
necessary at intervals, as can be determined readily by one
skilled in the art. For example, a priming dose is followed by
three booster doses at weekly or monthly intervals. An
appropriate dose depends on various parameters including the
recipient (e.g., adult or infant), the particular vaccine
antigen, the route and frequency of administration, the
presence/absence or type of adjuvant, and the desired effect
(e.g., protection and/or treatment), as can be determined by one
skilled in the art. In general, a vaccine antigen of the
invention is administered by a mucosal route in an amount from
about 10 ug to about 500 mg, preferably from about 1 mg to about
200 mg. For the parenteral route of administration, the dose
usually does not exceed about 1 mg, preferably about 100 ug.

When used as vaccine agents, polynucleotides and
polypeptides of the invention may be used sequentially as part
of a multistep immunization process. For example, a mammal is
initially primed with a vaccine vector of the invention such as
a pox virus, e.g., via the parenteral route, and then boosted
twice with the polypeptide encoded by the vaccine vector, e.g.,
via the mucosal route. In another example, liposomes associated
with a polypeptide or derivative of the invention are also used
for priming, with boosting being carried out mucosally using a
soluble polypeptide or derivative of the invention in
combination with a mucosal adjuvant (e.g., LT).

A polypeptide derivative of the invention is also used in
accordance with the seventh aspect as a diagnostic reagent for
detecting the presence of anti-Chlamydia antibodies, e.g., in a
blood sample. Such polypeptides are about 5 to about 80,
preferably about 10 to about 50 amino acids in length. They are
either labeled or unlabeled, depending upon the diagnostic
method. Diagnostic methods involving such a reagent are
described below.

Upon expression of a DNA molecule of the invention, a
polypeptide or polypeptide derivative is produced and purified
using known laboratory techniques. As described above, the
polypeptide or polypeptide derivative may be produced as a
fusion protein containing a fused tail that facilitates
purification. The fusion product is used to immunize a small
mammal, e.g., a mouse or a rabbit, in order to raise antibodies
against the polypeptide or polypeptide derivative (monospecific
antibodies). Accordingly, an eighth aspect of the invention
provides a monospecific antibody that binds to a polypeptide or
polypeptide derivative of the invention.

By "monospecific antibody" is meant an antibody that is
capable of reacting with a unique naturally-occurring Chlamydia
polypeptide. An antibody of the invention is either polyclonal
or monoclonal. Monospecific antibodies may be recombinant,
e.g., chimeric (e.g., constituted by a variable region of murine
origin associated with a human constant region), humanized (a
human immunoglobulin constant backbone together with a
hypervariable region of animal, e.g., murine, origin), and/or
single chain. Both polyclonal and monospecific antibodies may
also be in the form of immunoglobulin fragments, e.g., F(ab')2
or Fab fragments. The antibodies of the invention are of any
isotype, e.g., IgG or IgA, and polyclonal antibodies are of a
single isotype or a mixture of isotypes.

Antibodies against the polypeptides, homologs or
fragments of the present invention are generated by immunization
of a mammal with a composition comprising said polypeptide,
homolog or fragment. Such antibodies may be polyclonal or
monoclonal. Methods to produce polyclonal or monoclonal
antibodies are well known in the art. For a review, see
Antibodies, A Laboratory Manual, Cold Spring Harbor Laboratory,
Eds. E. Harlow and D. Lane (1988), and D.E. Yelton et al., 1981,
Ann. Rev. Biochem. 50:657-680. For monoclonal antibodies, see
Kohl and Milstein?...

The antibodies of the invention, which are raised to a
polypeptide or polypeptide derivative of the invention, are
produced and identified using standard immunological assays,
e.g., Western blot analysis, dot blot assay, or ELISA (see,
e.g., Coligan et al., Current Protocols in Immunology (1994)
John Wiley & Sons, Inc., New York, NY). The antibodies are used
in diagnostic methods to detect the presence of a Chlamydia
antigen in a sample, such as a biological sample. The
antibodies are also used in affinity chromatography for
purifying a polypeptide or polypeptide derivative of the
invention. As is discussed further below, such antibodies may
be used in prophylactic and therapeutic passive immunization
methods.

Accordingly, a ninth aspect of the invention provides
(i) a reagent for detecting the presence of Chlamydia in a
biological sample that contains an antibody, polypeptide, or
polypeptide derivative of the invention; and (ii) a diagnostic
method for detecting the presence of Chlamydia in a biological
sample, by contacting the biological sample with an antibody, a
polypeptide, or a polypeptide derivative of the invention, such
that an immune complex is formed, and by detecting such complex
to indicate the presence of Chlamydia in the sample or the
organism from which the sample is derived.

Those skilled in the art will readily understand that the
immune complex is formed between a component of the sample and
the antibody, polypeptide, or polypeptide derivative, whichever
is used, and that any unbound material is removed prior to
detecting the complex. It is understood that a polypeptide
reagent is useful for detecting the presence of anti-Chlamydia
antibodies in a sample, e.g., a blood sample, while an antibody
of the invention is used for screening a sample, such as a
gastric extract or biopsy, for the presence of Chlamydia
polypeptides.

For diagnostic applications, the reagent (i.e., the
antibody, polypeptide, or polypeptide derivative of the
invention) is either in a free state or immobilized on a solid
support, such as a tube, a bead, or any other conventional
support used in the field. Immobilization is achieved using
direct or indirect means. Direct means include passive
adsorption (non-covalent binding) or covalent binding between
the support and the reagent. By "indirect means" is meant that
an anti-reagent compound that interacts with a reagent is first
attached to the solid support. For example, if a polypeptide
reagent is used, an antibody that binds to it can serve as an
anti-reagent, provided that it binds to an epitope that is not
involved in the recognition of antibodies in biological samples.
Indirect means may also employ a ligand-receptor system, for
example, where a molecule such as a vitamin is grafted onto the
polypeptide reagent and the corresponding receptor immobilized
on the solid phase. This is illustrated by the
biotin-streptavidin system. Alternatively, a peptide tail is
added chemically or by genetic engineering to the reagent and
the grafted or fused product immobilized by passive adsorption
or covalent linkage of the peptide tail.

Such diagnostic agents may be included in a kit which
also comprises instructions for use. The reagent is labeled
with a detection means which allows for the detection of the
reagent when it is bound to its target. The detection means may
be a fluorescent agent such as fluorescein isocyanate or
fluorescein isothiocyanate, or an enzyme such as horseradish
peroxidase or luciferase or alkaline phosphatase, or a
radioactive element such as 125I or 51Cr.

Accordingly, a tenth aspect of the invention provides a
process for purifying, from a biological sample, a polypeptide
or polypeptide derivative of the invention, which involves
carrying out antibody-based affinity chromatography with the
biological sample, wherein the antibody is a monospecific
antibody of the invention.

For use in a purification process of the invention, the
antibody is either polyclonal or monospecific, and preferably is
of the IgG type. Purified IgGs are prepared from an antiserum
using standard methods (see, e.g., Coligan et al., supra).
Conventional chromatography supports, as well as standard
methods for grafting antibodies, are described in, e.g.,
Antibodies: A Laboratory Manual, D. Lane, E. Harlow, Eds. (1988)
and outlined below.

Briefly, a biological sample, such as a C. pneumoniae
extract preferably in a buffer solution, is applied to a
chromatography material, preferably equilibrated with the buffer
used to dilute the biological sample so that the polypeptide or
polypeptide derivative of the invention (i.e., the antigen) is
allowed to adsorb onto the material. The chromatography
material, such as a gel or a resin coupled to an antibody of the
invention, is in either a batch form or a column. The unbound
components are washed off and the antigen is then eluted with an
appropriate elution buffer, such as a glycine buffer or a buffer
containing a chaotropic agent, e.g., guanidine HCl, or high salt
concentration (e.g., 3 M MgCl2). Eluted fractions are recovered
and the presence of the antigen is detected, e.g., by measuring
the absorbance at 280 nm.

An eleventh aspect of the invention provides (i) a
composition of matter comprising a monospecific antibody of the
invention, together with a diluent or carrier; (ii) a
pharmaceutical composition comprising a therapeutically or
prophylactically effective amount of a monospecific antibody of
the invention; and (iii) a method for treating or preventing a
Chlamydia (e.g., C. trachomatis, C. psittaci, C. pneumoniae or
C. pecorum) infection, by administering a therapeutic or
prophylactic amount of a monospecific antibody of the invention
to an infected individual. Additionally, the eleventh aspect of
the invention encompasses the use of a monospecific antibody of
the invention in the preparation of a medicament for treating or
preventing Chlamydia infection.

The monospecific antibody is either polyclonal or
monoclonal, preferably of the IgA isotype (predominantly). In
passive immunization, the antibody is administered to a mucosal
surface of a mammal, e.g., the gastric mucosa, e.g., orally or
intragastrically, advantageously in the presence of a
bicarbonate buffer. Alternatively, systemic administration, not
requiring a bicarbonate buffer, is carried out. A monospecific
antibody of the invention is administered as a single active
component or as a mixture with at least one monospecific
antibody specific for a different Chlamydia polypeptide. The
amount of antibody and the particular regimen used are readily
determined by one skilled in the art. For example, daily
administration of about 100 to 1,000 mg of antibodies over one
week, or three doses per day of about 100 to 1,000 mg of
antibodies over two or three days, are effective regimens for
most purposes.
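
As a rough illustration of the arithmetic behind the example regimens above, the following Python sketch computes the cumulative amount of antibody administered. It assumes that the stated 100 to 1,000 mg applies to each individual dose, which the text does not state explicitly; the function names are hypothetical and chosen only for this example.

```python
def cumulative_dose_mg(dose_mg, doses_per_day, days):
    """Total antibody (mg) given over a regimen of equal doses."""
    return dose_mg * doses_per_day * days


def regimen_range_mg(low_mg, high_mg, doses_per_day, days):
    """Cumulative dose range (low, high) in mg for a regimen.

    Assumption: low_mg and high_mg are per-dose amounts.
    """
    return (
        cumulative_dose_mg(low_mg, doses_per_day, days),
        cumulative_dose_mg(high_mg, doses_per_day, days),
    )


if __name__ == "__main__":
    # Daily administration over one week.
    print(regimen_range_mg(100, 1000, doses_per_day=1, days=7))
    # Three doses per day over two or three days.
    print(regimen_range_mg(100, 1000, doses_per_day=3, days=2))
    print(regimen_range_mg(100, 1000, doses_per_day=3, days=3))
```

Under that per-dose assumption, the one-week regimen totals 700 to 7,000 mg, and the three-doses-per-day regimens total 600 to 6,000 mg over two days or 900 to 9,000 mg over three days.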

Therapeutic or prophylactic efficacy is evaluated using
standard methods in the art, e.g., by measuring induction of a
mucosal immune response or induction of protective and/or
therapeutic immunity, using, e.g., the C. pneumoniae mouse
model. Those skilled in the art will readily recognize that the
C. pneumoniae strain of the model may be replaced with another
Chlamydia strain. For example, the efficacy of DNA molecules
and polypeptides from C. pneumoniae is preferably evaluated in a
mouse model using a C. pneumoniae strain. Protection is
determined by comparing the degree of Chlamydia infection to
that of a control group. Protection is shown when infection is
reduced by comparison to the control group. Such an evaluation
is made for polynucleotides, vaccine vectors, polypeptides and
derivatives thereof, as well as antibodies of the invention.
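
The comparison against a control group described above can be sketched in Python as follows. This is only an illustrative outline: the infection readout (for example, a per-animal count of recovered organisms) and the function names are assumptions not taken from the text, and a real study would add an appropriate statistical test.

```python
from statistics import mean


def percent_reduction(control_counts, treated_counts):
    """Percent reduction in mean infection burden relative to control.

    Each argument is a list of per-animal infection measurements
    (hypothetical readout; the units only need to match).
    """
    control_mean = mean(control_counts)
    treated_mean = mean(treated_counts)
    if control_mean <= 0:
        raise ValueError("control group shows no infection")
    return 100.0 * (control_mean - treated_mean) / control_mean


def shows_protection(control_counts, treated_counts):
    """True when mean infection is lower than in the control group."""
    return mean(treated_counts) < mean(control_counts)


if __name__ == "__main__":
    control = [1000, 1200, 800]   # illustrative values only
    treated = [100, 300, 200]     # illustrative values only
    print(percent_reduction(control, treated))
    print(shows_protection(control, treated))
```

The sketch reports protection whenever the mean burden of the treated group is below that of the control group, mirroring the criterion stated in the text; the numeric values are made up for illustration.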

Adjuvants useful in any of the vaccine compositions 
described above are as follows. 

Adjuvants for parenteral administration include aluminum
compounds, such as aluminum hydroxide, aluminum phosphate, and
aluminum hydroxy phosphate. The antigen is precipitated with,
or adsorbed onto, the aluminum compound according to standard
protocols. Other adjuvants, such as RIBI (ImmunoChem, Hamilton,
MT), are also used in parenteral administration.

Adjuvants for mucosal administration include bacterial
toxins, e.g., the cholera toxin (CT), the E. coli heat-labile
toxin (LT), the Clostridium difficile toxin A and the pertussis
toxin (PT), or combinations, subunits, toxoids, or mutants
thereof such as a purified preparation of native cholera toxin
subunit B (CTB). Fragments, homologs, derivatives, and fusions
to any of these toxins are also suitable, provided that they
retain adjuvant activity. Preferably, a mutant having reduced
toxicity is used. Suitable mutants are described, e.g., in WO
95/17211 (Arg-7-Lys CT mutant), WO 96/6627 (Arg-192-Gly LT
mutant), and WO 95/34323 (Arg-9-Lys and Glu-129-Gly PT mutant).
Additional LT mutants that are used in the methods and
compositions of the invention include, e.g., Ser-63-Lys,
Ala-69-Gly, Glu-110-Asp, and Glu-112-Asp mutants. Other adjuvants,
such as a bacterial monophosphoryl lipid A (MPLA) of, e.g., E.
coli, Salmonella minnesota, Salmonella typhimurium, or Shigella
flexneri; saponins; or polylactide glycolide (PLGA)
microspheres, are also used in mucosal administration.

Adjuvants useful for both mucosal and parenteral
administrations include polyphosphazene (WO 95/2415), DC-chol
(3β-(N-(N',N'-dimethyl aminomethane)-carbamoyl) cholesterol; U.S.
Patent No. 5,283,185 and WO 96/14831) and QS-21 (WO 88/9336).

Any pharmaceutical composition of the invention
containing a polynucleotide, a polypeptide, a polypeptide
derivative, or an antibody of the invention, is manufactured in
a conventional manner. In particular, it is formulated with a
pharmaceutically acceptable diluent or carrier, e.g., water or a
saline solution such as phosphate buffered saline. In general, a
diluent or carrier is selected on the basis of the mode and
route of administration, and standard pharmaceutical practice.
Suitable pharmaceutical carriers or diluents, as well as
pharmaceutical necessities for their use in pharmaceutical
formulations, are described in Remington's Pharmaceutical
Sciences, a standard reference text in this field, and in the
USP/NF.

The invention also includes methods in which Chlamydia
infection is treated by oral administration of a Chlamydia
polypeptide of the invention and a mucosal adjuvant, in
combination with an antibiotic, an antacid, sucralfate, or a
combination thereof. Examples of such compounds that can be
administered with the vaccine antigen and the adjuvant are
antibiotics, including, e.g., macrolides, tetracyclines, and
derivatives thereof (specific examples of antibiotics that can
be used include azithromycin and doxycycline), and
immunomodulators such as cytokines or steroids. In addition,
compounds containing more than one of the above-listed
components coupled together are used. The invention also
includes compositions for carrying out these methods, i.e.,
compositions containing a Chlamydia antigen (or antigens) of the
invention, an adjuvant, and one or more of the above-listed
compounds, in a pharmaceutically acceptable carrier or diluent.

Amounts of the above-listed compounds used in the methods
and compositions of the invention are readily determined by one
skilled in the art. Treatment/immunization schedules are also
known and readily designed by one skilled in the art. For
example, the non-vaccine components can be administered on days
1-14, and the vaccine antigen + adjuvant can be administered on
days 7, 14, 21, and 28.
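
The example schedule above can be laid out day by day with a short Python sketch. The function name and the data layout are hypothetical and serve only to show which days combine both kinds of administration; the day values are the ones given in the text.

```python
def build_schedule(non_vaccine_days=range(1, 15),
                   vaccine_days=(7, 14, 21, 28)):
    """Map each treatment day to the list of items given that day."""
    schedule = {}
    for day in non_vaccine_days:
        schedule.setdefault(day, []).append("non-vaccine components")
    for day in vaccine_days:
        schedule.setdefault(day, []).append("vaccine antigen + adjuvant")
    # Return the days in chronological order.
    return dict(sorted(schedule.items()))


if __name__ == "__main__":
    for day, items in build_schedule().items():
        print(f"day {day:2d}: {', '.join(items)}")
```

With the default values, days 7 and 14 carry both the non-vaccine components and the vaccine antigen with adjuvant, while days 21 and 28 carry the vaccine antigen with adjuvant alone.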
1. A nucleic acid molecule comprising a nucleic acid
sequence which encodes a polypeptide selected from any of:

(a) SEQ ID Nos: 27 to 45;

(b) an immunogenic fragment comprising at least 12
consecutive amino acids from a polypeptide of (a); and

(c) a polypeptide of (a) or (b) which has been modified to
improve its immunogenicity, wherein said modified
polypeptide is at least 75% identical in amino acid
sequence to the corresponding polypeptide of (a) or
(b).

2. A nucleic acid molecule comprising a nucleic acid
sequence selected from any of:

(a) SEQ ID Nos: 1 to 26;

(b) a sequence which encodes a polypeptide encoded by any
one of SEQ ID Nos: 1 to 26;

(c) a sequence comprising at least 38 consecutive
nucleotides from any one of the nucleic acid sequences
of (a) and (b); and

(d) a sequence which encodes a polypeptide which is at
least 75% identical in amino acid sequence to any one
of the polypeptides encoded by SEQ ID Nos: 1 to 26.

3. A nucleic acid molecule comprising a nucleic acid
sequence which encodes a fusion protein, said fusion
protein comprising a polypeptide encoded by a nucleic acid
molecule according to claim 1 and an additional
polypeptide.

4. A nucleic acid molecule according to claim 1,
operatively linked to one or more expression control
sequences.

5. A vaccine comprising at least one first nucleic acid
according to any one of claims 1 to 4 and a vaccine vector
wherein each first nucleic acid is expressed as a
polypeptide, the vaccine optionally comprising a second
nucleic acid encoding an additional polypeptide which
enhances the immune response to the polypeptide expressed
by said first nucleic acid.

6. The vaccine of claim 5 wherein the second nucleic acid
encodes an additional Chlamydia polypeptide.

7. A pharmaceutical composition comprising a nucleic acid
according to any one of claims 1 to 5 and a
pharmaceutically acceptable carrier.

8. A pharmaceutical composition comprising a vaccine
according to claim 5 or 6 and a pharmaceutically acceptable
carrier.

9. A unicellular host transformed with the nucleic acid
molecule of claim 4.



10. A nucleic acid probe of 5 to 100 nucleotides which
hybridizes under stringent conditions to any one of nucleic
acid molecules of SEQ ID Nos: 1 to 26, or to a homolog or
complementary or anti-sense sequence of said nucleic acid
molecule.

11. A primer of 10 to 40 nucleotides which hybridizes
under stringent conditions to any one of nucleic acid
molecules of SEQ ID Nos: 1 to 26, or to a homolog or
complementary or anti-sense sequence of said nucleic acid
molecule.

12. A polypeptide encoded by a nucleic acid sequence
according to any one of claims 1 to 4.



13. A polypeptide comprising an amino acid sequence
selected from any of:

(a) SEQ ID Nos: 27 to 45;

(b) an immunogenic fragment comprising at least 12
consecutive amino acids from a polypeptide of (a); and

(c) a polypeptide of (a) or (b) which has been modified to
improve its immunogenicity, wherein said modified
polypeptide is at least 75% identical in amino acid
sequence to the corresponding polypeptide of (a) or
(b).

14. A fusion polypeptide comprising a polypeptide of claim
12 or 13 and an additional polypeptide.



15. A method for producing a polypeptide of claim 12 or
13, comprising the step of culturing a unicellular host
according to claim 9.

16. An antibody against the polypeptide of any one of
claims 12 to 14.

17. A vaccine comprising at least one first polypeptide
according to any one of claims 12 to 14 and a
pharmaceutically acceptable carrier, optionally comprising
a second polypeptide which enhances the immune response to
the first polypeptide.

18. The vaccine of claim 17 wherein the second polypeptide
comprises an additional Chlamydia polypeptide.

19. A pharmaceutical composition comprising a polypeptide
according to any one of claims 12 to 14 and a
pharmaceutically acceptable carrier.

20. A pharmaceutical composition comprising a vaccine
according to claim 17 or 18 and a pharmaceutically
acceptable carrier.

21. A pharmaceutical composition comprising an antibody
according to claim 16 and a pharmaceutically acceptable
carrier.



22. A method for preventing or treating Chlamydia
infection using:

(a) the nucleic acid of any one of claims 1 to 4;

(b) the vaccine of any one of claims 5, 6, 17 and 18;

(c) the pharmaceutical composition of any one of claims 7,
8, 19 to 21;

(d) the polypeptide of any one of claims 12 to 14; or

(e) the antibody of claim 16.

23. A method of detecting Chlamydia infection comprising
the step of assaying a body fluid of a mammal to be tested,
with a component selected from any one of:

(a) the nucleic acid of any one of claims 1 to 4;

(b) the polypeptide of any one of claims 12 to 14; and

(c) the antibody of claim 16.

24. A diagnostic kit comprising instructions for use and
a component selected from any one of:

(a) the nucleic acid of any one of claims 1 to 4;

(b) the polypeptide of any one of claims 12 to 14; and

(c) the antibody of claim 16.

Figure 1: CPN100397

attttaacgt gcgtatcatt tgtgactaag agatagactt gctttcttta tctatcttct 60

gtattggaaa gaaagcccct tgagggaaaa aaaggttgtt atg aag att cca ctc 115
Met Lys Ile Pro Leu
1 5

cgc ttt tta ttg ata tca tta gta cct acg ctt tct atg tcg aat tta
Arg Phe Leu Leu Ile Ser Leu Val Pro Thr Leu Ser Met Ser Asn Leu
10 15 20

tta gga gct gct act acc gaa gag tta tcg gct agc aat agc ttc gat
Leu Gly Ala Ala Thr Thr Glu Glu Leu Ser Ala Ser Asn Ser Phe Asp
25 30 35

gga act aca tca aca aca agc ttt tct agt aaa aca tca tcg gct aca
Gly Thr Thr Ser Thr Thr Ser Phe Ser Ser Lys Thr Ser Ser Ala Thr
40 45 50

gat ggc acc aat tat gtt ttt aaa gat tct gta gtt ata gaa aat gta
Asp Gly Thr Asn Tyr Val Phe Lys Asp Ser Val Val Ile Glu Asn Val
55 60 65

ccc aaa aca ggg gaa act cag tct act agt tgt ttt aaa aat gac gct
Pro Lys Thr Gly Glu Thr Gln Ser Thr Ser Cys Phe Lys Asn Asp Ala
70 75 80 85

gca gct gga gat cta aat ttc tta gga ggg gga ttt tct ttc aca ttt
Ala Ala Gly Asp Leu Asn Phe Leu Gly Gly Gly Phe Ser Phe Thr Phe
90 95 100

agc aat atc gat gca acc acg gct tct gga gct gct att gga agt gaa
Ser Asn Ile Asp Ala Thr Thr Ala Ser Gly Ala Ala Ile Gly Ser Glu
105 110 115

gca gct aat aag aca gtc acg tta tca gga ttt tcg gca ctt tct ttt
Ala Ala Asn Lys Thr Val Thr Leu Ser Gly Phe Ser Ala Leu Ser Phe
120 125 130

ctt aaa tcc cca gca agt aca gtg act aat gga ttg gga gct atc aat
Leu Lys Ser Pro Ala Ser Thr Val Thr Asn Gly Leu Gly Ala Ile Asn
135 140 145

gtt aaa ggg aat tta agc cta ttg gat aat gat aag gta ttg att cag
Val Lys Gly Asn Leu Ser Leu Leu Asp Asn Asp Lys Val Leu Ile Gln
150 155 160 165

gac aat ttc tca aca gga gat ggc gga gca att aat tgt gca ggc tcc
Asp Asn Phe Ser Thr Gly Asp Gly Gly Ala Ile Asn Cys Ala Gly Ser
170 175 180

ttg aag atc gca aac aat aag tcc ctt tct ttt att gga aat agt tct
Leu Lys Ile Ala Asn Asn Lys Ser Leu Ser Phe Ile Gly Asn Ser Ser
185 190 195

Fig. 1 (con't)

tca aca cgt ggc gga gcg att cat acc aaa aac ctc aca cta tct tct
Ser Thr Arg Gly Gly Ala Ile His Thr Lys Asn Leu Thr Leu Ser Ser
200 205 210

ggt ggg gaa act cta ttt cag ggg aat aca gcg cct acg gct gct ggt
Gly Gly Glu Thr Leu Phe Gln Gly Asn Thr Ala Pro Thr Ala Ala Gly
215 220 225

aaa gga ggt gct atc gcg att gca gac tct ggc acc cta tcc att tct
Lys Gly Gly Ala Ile Ala Ile Ala Asp Ser Gly Thr Leu Ser Ile Ser
230 235 240 245

gga gac agt ggc gac att atc ttt gaa ggc aat acg ata gga gct aca
Gly Asp Ser Gly Asp Ile Ile Phe Glu Gly Asn Thr Ile Gly Ala Thr
250 255 260

gga acc gtc tct cat agt gct att gat tta gga act agc gct aag ata
Gly Thr Val Ser His Ser Ala Ile Asp Leu Gly Thr Ser Ala Lys Ile
265 270 275

act gcg tta cgt gct gcg caa gga cat acg ata tac ttt tat gat ccg
Thr Ala Leu Arg Ala Ala Gln Gly His Thr Ile Tyr Phe Tyr Asp Pro
280 285 290

att act gta aca gga tcg aca tct gtt gct gat gct ctc aat att aat
Ile Thr Val Thr Gly Ser Thr Ser Val Ala Asp Ala Leu Asn Ile Asn
295 300 305

agc cct gat act gga gat aac aaa gag tat acg gga acc ata gtc ttt
Ser Pro Asp Thr Gly Asp Asn Lys Glu Tyr Thr Gly Thr Ile Val Phe
310 315 320 325

tct gga gag aag ctc acg gag gca gaa gct aaa gat gag aag aac cgc
Ser Gly Glu Lys Leu Thr Glu Ala Glu Ala Lys Asp Glu Lys Asn Arg
330 335 340

act tct aaa tta ctt caa aat gtt gct ttt aaa aat ggg act gta gtt
Thr Ser Lys Leu Leu Gln Asn Val Ala Phe Lys Asn Gly Thr Val Val
345 350 355

tta aaa ggt gat gtc gtt tta agt gcg aac ggt ttc tct cag gat gca
Leu Lys Gly Asp Val Val Leu Ser Ala Asn Gly Phe Ser Gln Asp Ala
360 365 370

aac tct aag ttg att atg gat tta ggg acg tcg ttg gtt gca aac acc
Asn Ser Lys Leu Ile Met Asp Leu Gly Thr Ser Leu Val Ala Asn Thr
375 380 385

gaa agt atc gag tta acg aat ttg gaa att aat ata gac tct ctc agg
Glu Ser Ile Glu Leu Thr Asn Leu Glu Ile Asn Ile Asp Ser Leu Arg
390 395 400 405

aac ggg aaa aag ata aaa ctc agt gct gcc aca gct cag aaa gat att
Asn Gly Lys Lys Ile Lys Leu Ser Ala Ala Thr Ala Gln Lys Asp Ile
410 415 420

cgt ata gat cgt cct gtt gta ctg gca att agc gat gag agt ttt tat
Arg Ile Asp Arg Pro Val Val Leu Ala Ile Ser Asp Glu Ser Phe Tyr
425 430 435

caa aat ggc ttt ttg aat gag gac cat tcc tat gat ggg att ctt gag
Gln Asn Gly Phe Leu Asn Glu Asp His Ser Tyr Asp Gly Ile Leu Glu
440 445 450

tta gat gct ggg aaa gac atc gtg att tct gca gat tct cgc agt ata
Leu Asp Ala Gly Lys Asp Ile Val Ile Ser Ala Asp Ser Arg Ser Ile
455 460 465

gat gct gta caa tct ccg tat ggc tat cag gga aag tgg acg atc aat
Asp Ala Val Gln Ser Pro Tyr Gly Tyr Gln Gly Lys Trp Thr Ile Asn
470 475 480 485

tgg tct act gat gat aag aaa gct acg gtt tct tgg gcg aag cag agt
Trp Ser Thr Asp Asp Lys Lys Ala Thr Val Ser Trp Ala Lys Gln Ser
490 495 500

ttt aat ccc act gct gag cag gag gct ccg tta gtt cct aat ctt ctt
Phe Asn Pro Thr Ala Glu Gln Glu Ala Pro Leu Val Pro Asn Leu Leu
505 510 515

tgg ggt tct ttt ata gat gtt cgt tcc ttc cag aat ttt ata gag cta
Trp Gly Ser Phe Ile Asp Val Arg Ser Phe Gln Asn Phe Ile Glu Leu
520 525 530

ggt act gaa ggt gct cct tac gaa aag aga ttt tgg gtt gca ggc att
Gly Thr Glu Gly Ala Pro Tyr Glu Lys Arg Phe Trp Val Ala Gly Ile
535 540 545

tcc aat gtt ttg cat agg agc ggt cgt gaa aat caa agg aaa ttc cgt
Ser Asn Val Leu His Arg Ser Gly Arg Glu Asn Gln Arg Lys Phe Arg
550 555 560 565

cat gtg agt gga ggt gct gta gta ggt gct agc acg agg atg ccg ggt
His Val Ser Gly Gly Ala Val Val Gly Ala Ser Thr Arg Met Pro Gly
570 575 580

ggt gat acc ttg tct ctg ggt ttt gct cag ctc ttt gcg cgt gac aaa
Gly Asp Thr Leu Ser Leu Gly Phe Ala Gln Leu Phe Ala Arg Asp Lys
585 590 595

gac tac ttt atg aat acc aat ttc gca aag acc tac gca gga tct tta
Asp Tyr Phe Met Asn Thr Asn Phe Ala Lys Thr Tyr Ala Gly Ser Leu
600 605 610

cgt ttg cag cac gat gct tcc cta tac tct gtg gtg agt atc ctt tta
Arg Leu Gln His Asp Ala Ser Leu Tyr Ser Val Val Ser Ile Leu Leu
615 620 625

gga gag gga gga ctc cgc gag atc ctg ttg cct tat gtt tcc aag act
Gly Glu Gly Gly Leu Arg Glu Ile Leu Leu Pro Tyr Val Ser Lys Thr
630 635 640 645

ctg ccg tgc tct ttc tat ggg cag ctt agc tac ggc cat acg gat cat
Leu Pro Cys Ser Phe Tyr Gly Gln Leu Ser Tyr Gly His Thr Asp His
650 655 660

cgc atg aag acc gag tct cta ccc ccc ccc ccc ccg acg ctc tcg acg
Arg Met Lys Thr Glu Ser Leu Pro Pro Pro Pro Pro Thr Leu Ser Thr
665 670 675

gat cat act tct tgg gga gga tat gtc tgg gct gga gag ctg gga act
Asp His Thr Ser Trp Gly Gly Tyr Val Trp Ala Gly Glu Leu Gly Thr
680 685 690

cga gtt gct gtt gaa aat acc agc ggc aga gga ttt ttc caa gag tac
Arg Val Ala Val Glu Asn Thr Ser Gly Arg Gly Phe Phe Gln Glu Tyr
695 700 705

act cca ttt gta aaa gtc caa gct gtt tac gct cgc caa gat agc ttt
Thr Pro Phe Val Lys Val Gln Ala Val Tyr Ala Arg Gln Asp Ser Phe
710 715 720 725

gta gaa cta gga gct atc agt cgt gat ttt agt gat tcg cat ctt tat
Val Glu Leu Gly Ala Ile Ser Arg Asp Phe Ser Asp Ser His Leu Tyr
730 735 740

aac ctt gcg att cct ctt gga atc aag tta gag aaa cgg ttt gca gag
Asn Leu Ala Ile Pro Leu Gly Ile Lys Leu Glu Lys Arg Phe Ala Glu
745 750 755

caa tat tat cat gtt gta gcg atg tat tct cca gat gtt tgt cgt agt
Gln Tyr Tyr His Val Val Ala Met Tyr Ser Pro Asp Val Cys Arg Ser
760 765 770

aac ccc aaa tgt acg act acc cta ctt tcc aac caa ggg agt tgg aag
Asn Pro Lys Cys Thr Thr Thr Leu Leu Ser Asn Gln Gly Ser Trp Lys
775 780 785

acc aaa ggt tcg aac tta gca aga cag gct ggt att gtt cag gcc tca
Thr Lys Gly Ser Asn Leu Ala Arg Gln Ala Gly Ile Val Gln Ala Ser
790 795 800 805

ggt ttt cga tct ttg gga gct gca gca gag ctt ttc ggg aac ttt ggc
Gly Phe Arg Ser Leu Gly Ala Ala Ala Glu Leu Phe Gly Asn Phe Gly
810 815 820

Fig. 1 (con't)

ttt gaa tgg cgg gga tct tct cgt agc tat aat gta gat gcg ggt agc 2611
Phe Glu Trp Arg Gly Ser Ser Arg Ser Tyr Asn Val Asp Ala Gly Ser
825 830 835

aaa atc aaa ttt tagcgatttc tctttcgatg ctatttttcc atggctattt 2663
Lys Ile Lys Phe
840

ttaaaatgat agccatggtt atagatacgt agtccttatt tcaaagaaga cactgttgca 2723
ttagatacgc tctctgatcc ctcaaaa 2750

Figure 2 (RY-32)

Restriction Enzyme analysis of CPN100397

Mae I I Maelll Bael 

Msel | Tsp45I Ddel | MboII 

II III I 

ATTTTAACGTGCGTATCATTTGTGACTAAGAGATAGACTTGCTTTCTTTATCTATCTTCT 

1 + + + + + + 60 

TAAAATTGCACGCATAGTAAACACTGATTCTCTATCTGAACGAAAGAAATAGATAGAAGA 

BslI 
Smll 

CviJI | Hinfl Acil 

Mnll I Bce83I Tfil MboII 

II III 
GTATTGGAAAGAAAGCCCCTTGAGGGAAAAAAAGGTTGTTATGAAGATTCCACTCCGCTT 
61 + + + + + + 120



Apol 

Bbvl Fnu4HI 
Tsp509I Alul| 

AceIIl| CviJI | 
EcoRV Rsal Taql|| Tsel j Earl 

I I III II I 

TTTATTGATATCATTAGTACCTACGCTTTCTATGTCGAATTTATTAGGAGCTGCTACTAC 

121 + + + + + + 180

AAATAACTATAGTAATCATGGATGCGAAAGATACAGCTTAAATAATCCTCGACGATGATG 



Cac8I 
Bfal | 
CviJI | j 
MboII j j 
Nhel | | 
III 



TaqI 
Alul | 
CviJI |Bcc 
I I 



Alul 
CviJI 
Bcgl Hindi I I | 

I I 



CGAAGAGTTATCGGCTAGCAATAGCTTCGATGGAACTACATCAACAACAAGCTTTTCTAG 

181 + + + + + + 240 

GCTTCTCAATAGCCGATCGTTATCGAAGCTACCTTGATGTAGTTGTTGTTCGAAAAGATC 



Sfcl 
CviJI | 

I I 



Tsp509I 
NlalV | 
BanI | j 
Bed | j j 
II I I 



Hinfl 
Dral | Sfcl 
Msel | Tfil | 

II 



TAAAACATCATCGGCTACAGATGGCACCAATTATGTTTTTAAAGATTCTGTAGTTATAGA 

241 + + + + + + 300 

ATTTTGTAGTAGCCGATGTCTACCGTGGTTAATACAAAAATTTCTAAGACATCAATATCT 
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Rsal 

I 



BslI 
BslI | 
II 



Dral 
Msel 
Bbvl 
BseMII | 
Bfal | 
Spel| | 
Ddel AccI | I | 
III I 



Alul 
CviJI 
Hgal 
MspAlI 
PvuII 
PstI | 
Fnu4HI | 
CviRI | 
Tsel j 
Fnu4HI | | 
Sfcl j | 
Tsel | | | 
II II 



AAATGTACCCAAAACAGGGGAAACTCAGTCTACTAGTTGTTTTAAAAATGACGCTGCAGC 



TTTACATGGGTTTTGTCCCCTTTGAGTCAGATGATCAACAAAATTTTTACTGCGACGTCG 



Apol 
Tsp509I 
Bbvl | 
Dpnl 
Bglll | 
BstYI j 
Sau3AI j 
I I 



Ddel 

| Mnl I | Bpml SfaNI 

III I 
TGGAGATCTAAATTTCTTAGGAGGGGGATTTTCTTTCACATTTAGCAATATCGATGCAAC 

361 + + + + + + 420 

ACCTCTAGATTTAAAGAATCCTCCCCCTAAAAGAAAGTGTAAATCGTTATAGCTACGTTG 



Acelll 
BsaJI 
BstDSI 
CviRI | 
Cjel | | 
Clal | | | 
Taql | | | 
II I I 



Bcefl 



Fnu4HI 
Alul | 

CviJI | 
Tsel j 



Hpyl78III 
Maell | 
Maelll | j 



Alul 

Mwol | | | CviJI 
Hpyl78III | jj j Fnu4HI | 

CviJI Ml j Tsel | j Taal | | 

Mwol | Ml I Bpml | | j Tsp45I | | 

Bbvl | | Ml | Cjel | | | | Bbvl | | | 

II I II I I I I II I III I 

CACGGCTTCTGGAGCTGCTATTGGAAGTGAAGCAGCTAATAAGACAGTCACGTTATCAGG 

421 + + + + + + 480 

GTGCCGAAGACCTCGACGATAACCTTCACTTCGTCGATTATTCTGTCAGTGCAATAGTCC 



Maelll 
Taal 
Tsp45I 

Rsal I Alul 
Msel TatI | |TspRI CviJI 

I I I I I I 

ATTTTCGGCACTTTCTTTTCTTAAATCCCCAGCAAGTACAGTGACTAATGGATTGGGAGC 

481 + + + + + + 540 

TAAAAGCCGTGAAAGAAAAGAATTTAGGGGTCGTTCATGTCACTGATTACCTAACCCTCG 
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Hin4I 
Cjel 
Tsp509I 

Apol Hpyl78III | 

Tsp509I CviJI Hinfl | j 

Msel iMsel I Tfil | | 

I I I I I I I 

TATCAATGTTAAAGGGAATTTAAGCCTATTGGATAATGATAAGGTATTGATTCAGGACAA 

541 + + + + + + 600 

ATAGTTACAATTTCCCTTAAATTCGGATAACCTATTACTATTCCATAACTAAGTCCTGTT 



Tsp509I 
Msel 
Vspl 
Tsp509I 
Ecil | 
Acil| | 
Bed | | j 
I II I 



NlalV 
CviJI 
Cac8I | 
CviRI | j 
:jel | I I 
I I I II 

TTTCTCAACAGGAGATGGCGGAGCAATTAATTGTGCAGGCTCCTTGAAGATCGCAAACAA 

601 + + + + + + 660 

AAAGAGTTGTCCTCTACCGCCTCGTTAATTAACACGTCCGAGGAACTTCTAGCGTTTGTT 



Dpnl 
BsmFl| MboII 
Sau3AI | |BsgI | 
III I I 



BsaAI 
Pmll 
Maell | 

XmnI AflHI | j Ecil Hinfl 
Tthlllll MboII | Bsbl ||Acil| Tfil 

I II I II II I 

TAAGTCCCTTTCTTTTATTGGAAATAGTTCTTCAACACGTGGCGGAGCGATTCATACCAA 

661 + + + + + + 720 

ATTCAGGGAAAGAAAATAACCTTTATCAAGAAGTTGTGCACCGCCTCGCTAAGTATGGTT 



Fnu4HI 
CviJI | 
Tsel j 
Mwol | j 
Bbvl Haell | | | 

MboII Mnll Cjel Hhal| | || 

I I 

AAACCTCACACTATCTTCTGGTGGGGAAACTCTATTTCAGGGGAATACAGCGCCTACGGC 

721 + + + + + + 780 

TTTGGAGTGTGATAGAAGACCACCCCTTTGAGATAAAGTCCCCTTATGTCGCGGATGCCG 
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NlalV 
BanI 
AlwNI 
Hinfl 
CviRI 



Hpyl78III 
Nrul 

Hin4I Thai 
Mnll Beef I |CjeI | 
I I I I I 



Hpyl78III 
BsmAI | 

I 



TGCTGGTAAAGGAGGTGCTATCGCGATTGCAGACTCTGGCACCCTATCCATTTCTGGAGA 

781 + + + + + + 840 

ACGACCATTTCCTCCACGATAGCGCTAACGTCTGAGACCGTGGGATAGGTAAAGACCTCT 

Sfcl Ppil 
TspRI Alul | Taal|BsmAI 

Taal I Bpml CviJI j NlalV ||BsmBI 

II I 

CAGTGGCGACATTATCTTTGAAGGCAATACGATAGGAGCTACAGGAACCGTCTCTCATAG 

841 + + + + + + 900 

GTCACCGCTGTAATAGAAACTTCCGTTATGCTATCCTCGATGTCCTTGGCAGAGAGTATC 

Hhal 
Fspl j 
Fnu4HI 
Tsel| 
BsaAI 
Mwol 
Maell | 
Maelll | j 
Bbvl | | j 
I I II 

TGCTATTGATTTAGGAACTAGCGCTAAGATAACTGCGTTACGTGCTGCGCAAGGACATAC 

901 + + + + + + 960 

ACGATAACTAAATCCTTGATCGCGATTCTATTGACGCAATGCACGACGCGTTCCTGTATG 



Ddel 
Haell 
Hhal | 
EC047III | | 
Bfal I I I 

I III 



Hpyl8 8IX 
Dpnl | 
Sau3AI | | 
Alwl | j | 

I III 



Alwl 
SfaNI 
BsaBI | 
Taql | | 
Maelll Dpnl | | | 

Taal Sau3AI | | | j 
I II I ■ 



GATATACTTTTATGATCCGATTACTGTAACAGGATCGACATCTGTTGCTGATGCTCTCAA 



CTATATGAAAATACTAGGCTAATGACATTGTCCTAGCTGTAGACAACGACTACGAGAGTT 
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Drdll 
NlaIV| 
BscGI 

Ms el Bpml| 
Vspl BstZ17I I I 

Sspl I CviJI BslI Bsrl Sthl32I AccI | || || Hpyl78III 

Ml II I II II II I 

TATTAATAGCCCTGATACTGGAGATAACAAAGAGTATACGGGAACCATAGTCTTTTCTGG 

1021 + + + + + + 1080 

ATAATTATCGGGACTATGACCTCTATTGTTTCTCATATGCCCTTGGTATCAGAAAAGACC 



Alul 
CviJI 
Mnll 
Hin4I | 



Alul 



CviJI Tsp509I 
Bpml | Acil MboII | 

. I I I I I I 

AGAGAAGCTCACGGAGGCAGAAGCTAAAGATGAGAAGAACCGCACTTCTAAATTACTTCA 

1081 + + + + + + 1140

TCTCTTCGAGTGCCTCCGTCTTCGATTTCTACTCTTCTTGGCGTGAAGATTTAATGAAGT 



BsmFI 

Dral Taal Dral | HphI 

Msel | Sfcl | Msel | | Msel 

II II III I 

AAATGTTGCTTTTAAAAATGGGACTGTAGTTTTAAAAGGTGATGTCGTTTTAAGTGCGAA 

1141 + + + + + + 1200

TTTACAACGAAAATTTTTACCCTGACATCAAAATTTTCCACTACAGCAAAATTCACGCTT 



Hpyl78III 
SfaNI 
Ppil| 
Taal | | 
XmnI j j Ddel 
III I 



Fokl 
Ddel | 
BseMIl| | 
Hin4I | j 
CviRI | | j 

II II 



Aatll 
BsaHI | CviRI 
Mae I I | BsmFI | 

II II 

CGGTTTCTCTCAGGATGCAAACTCTAAGTTGATTATGGATTTAGGGACGTCGTTGGTTGC 

1201 + + + + + + 1260 

GCCAAAGAGAGTCCTACGTTTGAGATTCAACTAATACCTAAATCCCTGCAGCAACCAACG 



Apol 

Tsp509I Plel 
Hindi | Msel | 

Tthlllll Hpal | Vspl j 

Taql|Msel| j Tsp509I | | 

II II I II' 



Hpyl78III 
Ddel | 
Sthl32l| | 
Hinfl | j j BscGI 

I II I I 



AAACACCGAAAGTATCGAGTTAACGAATTTGGAAATTAATATAGACTCTCTCAGGAACGG 

1261 + + + + + + 1320 

TTTGTGGCTTTCATAGCTCAATTGCTTAAACCTTTAATTATATCTGAGAGAGTCCTTGCC 
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Bbvl 
BseMII | 



Hpyl8 8IX 
Ddel 
Alul | 
TspRI Cvi JI j 
Fnu4HI (BseMII | j 
Ddel Tsel | |MwoI | | | 
III I II 



Dpnl 
Sau3AI | 
BseMII | j 
Acelll | j | 
I I I 



GAAAAAGATAAAACTCAGTGCTGCCACAGCTCAGAAAGATATTCGTATAGATCGTCCTGT 

1321 + + + + + + 1380

CTTTTTCTATTTTGAGTCACGACGGTGTCGAGTCTTTCTATAAGCATATCTAGCAGGACA 

BsrI 
Tsp509I 

Rsal | M nl1 

TatI I Hin4I CviJI | 

I II I I I I I 

TGTACTGGCAATTAGCGATGAGAGTTTTTATCAAAATGGCTTTTTGAATGAGGACCATTC 

1381 + + + + + + 1440

ACATGACCGTTAATCGCTACTCTCAAAAATAGTTTTACCGAAAAACTTACTCCTGGTAAG 



Avail 
Sau96I MslI 



Hpyl78III 
Smll ' 
SfaNI 
Hinfl 
Tfil 
Bed | 
Bsll| | 
Xcml | | | 
I III 



Hpyl7 8III 
Bce83I | 
I I 



Hinfl 
Tfil 
PstI | 
CviRI | j 
Sfcl | | | SfaNI 
I I II I 



CTATGATGGGATTCTTGAGTTAGATGCTGGGAAAGACATCGTGATTTCTGCAGATTCTCG 
GATACTACCCTAAGAACTCAATCTACGACCCTTTCTGTAGCACTAAAGACGTCTAAGAGC 



Rsal 
BsrGI | 
TatI | 



Muni 
Tsp509I 
Dpnl | 

CviJI Sau3AI | | AccI 

, I I I I I I 

CAGTATAGATGCTGTACAATCTCCGTATGGCTATCAGGGAAAGTGGACGATCAATTGGTC 

1501 + + + + + + 1560

GTCATATCTACGACATGTTAGAGGCATACCGATAGTCCCTTTCACCTGCTAGTTAACCAG 

Mnll 
TspRI 

BseMII Bpull02l| 
Msel | Ddel j 

Hin4I | | BtsI | | 

| | II I I M 

TACTGATGATAAGAAAGCTACGGTTTCTTGGGCGAAGCAGAGTTTTAATCCCACTGCTGA 

ATGACTACTATTCTTTCGATGCCAAAGAACCCGCTTCGTCTCAAAATTAGGGTGACGACT 



Alul 
CviJI Taal 
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BsaXI 
NlalV 
CviJl| MboII 
II I 

GCAGGAGGCTCCGTTAGTTCCTAATCTTCTTTGGGGTTCTTTTATAGATGTTCGTTCCTT 

1621 + + + + + + 1680 

CGTCCTCCGAGGCAATCAAGGATTAGAAGAAACCCCAAGAAAATATCTACAAGCAAGGAA 

Apol Bfal 
Tsp509I Alul| BsiHKAI 

Hpyl78III | CviJI j Rsal Bspl286I Eco57I CviRI 

II III I I I 

CCAGAATTTTATAGAGCTAGGTACTGAAGGTGCTCCTTACGAAAAGAGATTTTGGGTTGC 

1681 + + + + + + 1740 

GGTCTTAAAATATCTCGATCCATGACTTCCACGAGGAATGCTTTTCTCTAAAACCCAACG 



Hpyl78III 
BsiEI | 

BsaXI | | MslI 
Acil | j | Apol Mnll| 

Cac8I CviRI BsrBI j j j Tsp509I Nlalll j 

I I I I I I I II 

AGGCATTTCCAATGTTTTGCATAGGAGCGGTCGTGAAAATCAAAGGAAATTCCGTCATGT 

1741 + + + + + + 1800 

TCCGTAAAGGTTACAAAACGTATCCTCGCCAGCACTTTTAGTTTCCTTTAAGGCAGTACA 



Sfcl 
BsaXI | 
I I 



MslI 
Sthl32I | 
BssSI 
Cac8I 
CjePI 
Bfal | 
Mnll | 
Sf aNI | 
Nhel | | 
III 



Neil 
ScrFI 
Mspl | 

II 



HaelV BsmAI 
Hin4I CjePI | 
Fokl | HphI | | 

I I II 



GAGTGGAGGTGCTGTAGTAGGTGCTAGCACGAGGATGCCGGGTGGTGATACCTTGTCTCT 

1801 + + + + + + 1860 

CTCACCTCCACGACATCATCCACGATCGTGCTCCTACGGCCCACCACTATGGAACAGAGA 



Alul 
CviJI 
Bpull02I | 
Ddel j 
I 



Acelll 
BseMII 
Maelll 
Tsp45I 
Hhal | 
Thai j 
Mwol | j 
I I I 



Tsp509I 

I 

GGGTTTTGCTCAGCTCTTTGCGCGTGACAAAGACTACTTTATGAATACCAATTTCGCAAA 

1861 + + + + + + 1920 

CCCAAAACGAGTCGAGAAACGCGCACTGTTTCTGATGAAATACTTATGGTTAAAGCGTTT 
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Fnu4HI 
CviRI | 
Tsel | 

Dpnl SfaNI | | 
BstYI | Alwl | j I 
Sau3AI j Maell j | | Bbvl 

II I I II I 

GACCTACGCAGGATCTTTACGTTTGCAGCACGATGCTTCCCTATACTCTGTGGTGAGTAT 

1921 + + + + + + 1980

CTGGATGCGTCCTAGAAATGCAAACGTCGTGCTACGAAGGGATATGAGACACCACTCATA 

Dpnl 
BstYI | 
Sau3AI 

BciVI Thai | 

Mnll Acil | j 

HphI | Alwl III Pie I 

Mnll jplel Hinfl | | j j Beef I | Hinfl 

III I I I I I M I 

CCTTTTAGGAGAGGGAGGACTCCGCGAGATCCTGTTGCCTTATGTTTCCAAGACTCTGCC 

1981 + + + + + + 2040

GGAAAATCCTCTCCCTCCTGAGGCGCTCTAGGACAACGGAATACAAAGGTTCTGAGACGG 



BsiHKAI 
Bspl286I 



CviJI 
Haelll 
Bbvl 
Eael 
Gdill 
Alul 



Bpull02I 
Ddel 
Alul | 
CviJI | 
Fnu4HI | | 
Tsel| || 
II II 



I 



Alwl 
Bcefl | 
Dpnl | | 
Sau3AI | | | 

II II 



Bbsl 
Hin4I 
Hinfl 
Tthllll | 
Nlalll 
I 



GTGCTCTTTCTATGGGCAGCTTAGCTACGGCCATACGGATCATCGCATGAAGACCGAGTC 



CACGAGAAAGATACCCGTCGAATCGATGCCGGTATGCCTAGTAGCGTACTTCTGGCTCAG 



Taqll 
Plel 
BsmAI | 
MboII | j 

I II 



Dpnl 
Sau3AI 
Hgal 
Hpyl78III 
TaqI 
Sthl32I I 



Mnll 
Alwl | 

I ' 



Acelll 
BsaXI | 
Hin4I | 

I 



TCTACCCCCCCCCCCCCCGACGCTCTCGACGGATCATACTTCTTGGGGAGGATATGTCTG 

2101 + + + + + + 2160

AGATGGGGGGGGGGGGGGCTGCGAGAGCTGCCTAGTATGAAGAACCCCTCCTATACAGAC 
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Fnu4HI 

TaqI Taul 
Aval| Acil| 
Alul Smll| MspAllj 
CviJI CviJI Xhol | Bpml Mnll | j 

I I II I Ml 

GGCTGGAGAGCTGGGAACTCGAGTTGCTGTTGAAAATACCAGCGGCAGAGGATTTTTCCA 

2161 + + + + + + 2220 

CCGACCTCTCGACCCTTGAGCTCAACGACAACTTTTATGGTCGCCGTCTCCTAAAAAGGT 

Alul 

Rsal Alul Cac8I CviJI 

TatI | CviJI Mwol | Mwol | 

II I I I I I 

AGAGTACACTCCATTTGTAAAAGTCCAAGCTGTTTACGCTCGCCAAGATAGCTTTGTAGA 

2221 + + + + + + 2280 

TCTCATGTGAGGTAAACATTTTCAGGTTCGACAAATGCGAGCGGTTCTATCGAAACATCT 

BsaBI 

Alul Hinfl | CjePI Hinfl 

Bfal CviJI Hpyl78III Tfil j SfaNI Tfil 

II I II I I 
ACTAGGAGCTATCAGTCGTGATTTTAGTGATTCGCATCTTTATAACCTTGCGATTCCTCT 

2281 + + + + + + 2340 

TGATCCTCGATAGTCAGCACTAAAATCACTAAGCGTAGAAATATTGGAACGCTAAGGAGA 

Mnll 

Hinfl | CviRI Bpml Tthlllll 

Tfil j CjePI Taal | Sspl Nlalll | Hin4l| 

II III I M M 

TGGAATCAAGTTAGAGAAACGGTTTGCAGAGCAATATTATCATGTTGTAGCGATGTATTC 

2341 + + + + + + 2400 

ACCTTAGTTCAATCTCTTTGCCAAACGTCTCGTTATAATAGTACAACATCGCTACATAAG 

BslI 

Maelll BsaJI | 

Hpyl78III PflllOSI | Rsal Mmel Styl j 

I II I I M 

TCCAGATGTTTGTCGTAGTAACCCCAAATGTACGACTACCCTACTTTCCAACCAAGGGAG 

2401 + + + + + + 2460 

AGGTCTACAAACAGCATCATTGGGGTTTACATGCTGATGGGATGAAAGGTTGGTTCCCTC 

BSU36I 

NspV Ddel 
TaqI CviJI | 

MboIl| Hael | 

Mmel | | CviJI Haelll j 

Bbsl| || Ddel Mwol | StuI j Acelll 

II II I II M I 

TTGGAAGACCAAAGGTTCGAACTTAGCAAGACAGGCTGGTATTGTTCAGGCCTCAGGTTT 

2461 + + + + + + 2520 

AACCTTCTGGTTTCCAAGCTTGAATCGTTCTGTCCGACCATAACAAGTCCGGAGTCCAAA 
Fig. 2 (con't)



BseMII 

Dpnl | 

Sau3AI | j 

Bbvl | j j 

Mnllj | j 

Taql| | | 
II ' 



Hpyl7 8III 
Alul | 
CviJI j 
Sthl32I 
PstI 



Fnu4HI | 
CviRI | 
Tsel | 
Fnu4HI 
Sfcl 
Alul | 
CviJI | 
Tsel | 
II 



| Bbvl 
I 



Dpnl 
BstYI | 
Faul Sau3AI | 

Sthl32l| Acil | j 
CviJI | j MboII | j j 
III II I I 



TCGATCTTTGGGAGCTGCAGCAGAGCTTTTCGGGAACTTTGGCTTTGAATGGCGGGGATC 

2521 + + + + + + 2580 

AGCTAGAAACCCTCGACGTCGTCTCGAAAAGCCCTTGAAACCGAAACTTACCGCCCCTAG 



Faul 
Sthl32l| 
SfaNI 
Alul | 
CviJI j 
Pflll08I | j 
Alwl | j j 
I I II 



Apol 
Tsp509I 



SfaNI 



TaqI 



Acil 

I I I I 

TTCTCGTAGCTATAATGTAGATGCGGGTAGCAAAATCAAATTTTAGCGATTTCTCTTTCG 

2581 + + + + + + 2640 

AAGAGCATCGATATTACATCTACGCCCATCGTTTTAGTTTAAAATCGCTAAAGAGAAAGC 



CviJI 
Nlalll 
BsaJI 
BstDSI 
Ncol 
Styl 



Nlalll 
BsaJI 
BstDSI 
Ncol 
Styl 
CviJI | 



Dral 
Msel | 

. .. II 

ATGCTATTTTTCCATGGCTATTTTTAAAATGATAGCCATGGTTATAGATACGTAGTCCTT 

2641 + + + + + + 2700 

TACGATAAAAAGGTACCGATAAAAATTTTACTATCGGTACCAATATCTATGCATCAGGAA 



BsaAI 
HaelV 
Hin4I 
SnaBI 
Maell| 
II 



CviRI 
MboII 
TspRI | 
Taal | | 
Bbsl| || 
II II 



Dpnl 
Sau3AI | 
Hpyl88IX| 
Alwl | j 
Hin4I | | j 
' I II 



ATTTCAAAGAAGACACTGTTGCATTAGATACGCTCTCTGATCCCTCAAAA 

2701 + + + + + 2750

TAAAGTTTCTTCTGTGACAACGTAATCTATGCGAGAGACTAGGGAGTTTT 
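
A restriction analysis of the kind shown in Figure 2 can be approximated with a short Python sketch that scans a sequence for enzyme recognition sites. The function name and the reduced enzyme table below are assumptions made for illustration (only five well-known palindromic sites are included), and the sketch reports where each recognition sequence starts, not where the enzyme cuts.

```python
# Recognition sequences (5' to 3') for a few common enzymes.
# All five are palindromic, so scanning one strand is sufficient.
RECOGNITION_SITES = {
    "EcoRV": "GATATC",
    "HindIII": "AAGCTT",
    "PstI": "CTGCAG",
    "NcoI": "CCATGG",
    "XhoI": "CTCGAG",
}


def find_sites(sequence, sites=RECOGNITION_SITES):
    """Return {enzyme: [1-based start positions]} for sites found."""
    seq = sequence.upper()
    hits = {}
    for enzyme, motif in sites.items():
        positions = []
        start = seq.find(motif)
        while start != -1:
            positions.append(start + 1)          # convert to 1-based
            start = seq.find(motif, start + 1)   # allow overlapping hits
        if positions:
            hits[enzyme] = positions
    return hits


if __name__ == "__main__":
    # Short fragment taken from positions 121 onward of the figure.
    fragment = "TTTATTGATATCATTAGTACC"
    print(find_sites(fragment))
```

Applied to the full 2750-nucleotide sequence, the reported positions would be relative to nucleotide 1 of the figure; enzymes with degenerate or non-palindromic recognition sequences would need a more complete treatment than this sketch provides.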
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Figure 3: CPN1 00421 

ctcctgtccc tcgcgttgtc aacctacccc tcctccctcg aattctaatc ctttgaacgt 60 

agtacaacag cctgttgctg catcgtcagt gccttcctac atg ccc cca ctg aat 115 

Met Pro Pro Leu Asn 



gct gat gat gtt ctc cct aga gac cat ctg tca gat gga agt ttc tca
Ala Asp Asp Val Leu Pro Arg Asp His Leu Ser Asp Gly Ser Phe Ser

gat acg tat cca gac att aca acg caa gcg atc atc tta att ttc ttg
Asp Thr Tyr Pro Asp Ile Thr Thr Gln Ala Ile Ile Leu Ile Phe Leu

gcc cta tcg cct ttc ctg gtc atg ttg ctc act tcg tat cta aag att
Ala Leu Ser Pro Phe Leu Val Met Leu Leu Thr Ser Tyr Leu Lys Ile

atc att act tta gtc tta tta cgt aac gcc tta gga gta caa caa aca
Ile Ile Thr Leu Val Leu Leu Arg Asn Ala Leu Gly Val Gln Gln Thr

cct ccc agt caa gtc ctc aat ggg att gca ctc atc cta tct att tat
Pro Pro Ser Gln Val Leu Asn Gly Ile Ala Leu Ile Leu Ser Ile Tyr

gtg atg ttc ccc acg gga gtg gct atg tat aaa gat gct cgc aag gaa
Val Met Phe Pro Thr Gly Val Ala Met Tyr Lys Asp Ala Arg Lys Glu
90 95 100

atc gaa gcc aat acc att cct caa agc ctc ttc act gca gaa ggt gct
Ile Glu Ala Asn Thr Ile Pro Gln Ser Leu Phe Thr Ala Glu Gly Ala
105 110 115

gaa aca gtg ttt gtc gct tta aac aaa tct aaa gaa cct ttg cgc tct
Glu Thr Val Phe Val Ala Leu Asn Lys Ser Lys Glu Pro Leu Arg Ser
120 125 130

ttc tta att cgc aac act cca aaa gca caa att caa agc ttt tac aag
Phe Leu Ile Arg Asn Thr Pro Lys Ala Gln Ile Gln Ser Phe Tyr Lys
135 140 145

atc tca cag aaa acc ttc cct tcg gaa att cga gcg cac ctc act gcc
Ile Ser Gln Lys Thr Phe Pro Ser Glu Ile Arg Ala His Leu Thr Ala
150 155 160 165

tcc gac ttt gta atc att att cct gct ttt att atg ggt cag ata aaa
Ser Asp Phe Val Ile Ile Ile Pro Ala Phe Ile Met Gly Gln Ile Lys
170 175 180

aat gct ttc gaa att gga gtc ttg atc tat cta cct ttc ttt gtt att
Asn Ala Phe Glu Ile Gly Val Leu Ile Tyr Leu Pro Phe Phe Val Ile
185 190 195
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Fig. 3 (con't) 

gat tta gtg act gct aac gtt ctt gta gcg atg cag atg atg atg tta 739
Asp Leu Val Thr Ala Asn Val Leu Val Ala Met Gln Met Met Met Leu
200 205 210

tcc cct cta tcg att tcg tta cct tta aag tta ctt ttg atc gtc atg 787
Ser Pro Leu Ser Ile Ser Leu Pro Leu Lys Leu Leu Leu Ile Val Met
215 220 225

gta gac gga tgg aca tta ctg ctc caa ggg ctt atg atc agc ttt aaa 835
Val Asp Gly Trp Thr Leu Leu Leu Gln Gly Leu Met Ile Ser Phe Lys
230 235 240 245

taaggacacg tgccgtgtta gcatttttcg caactagttt caaatctgtt ctttttgagt 895 

actcctacca atcattatta cttattttga ttgtttcggc acctcccatc atcttagctt 955 

ccatagtcgg gattatggtt gcgatcttcc aagccgcaac acaaa 1000
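The pairing of codons and residues in the listing above can be checked mechanically. The following is a minimal sketch, assuming the standard genetic code applies to this reading frame; the names `CODON_TABLE` and `translate` are illustrative and do not come from the specification.

```python
# Standard genetic code, with codons enumerated in t, c, a, g order
# at each of the three positions (an assumption of this sketch).
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(orf: str) -> str:
    """Translate a DNA reading frame, stopping at the first stop codon."""
    seq = "".join(orf.lower().split())  # drop the spaces between codons
    protein = []
    for pos in range(0, len(seq) - 2, 3):
        residue = CODON_TABLE[seq[pos:pos + 3]]
        if residue == "*":  # stop codon ends the polypeptide
            break
        protein.append(residue)
    return "".join(protein)


# First five codons of the Figure 3 reading frame (Met Pro Pro Leu Asn).
print(translate("atg ccc cca ctg aat"))  # MPPLN
```

The one-letter output MPPLN corresponds to the three-letter residues Met Pro Pro Leu Asn printed under the first codon row of Figure 3.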
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Figure 4 (RY-34) 

Restriction enzyme analysis of CPN100421



BseRI 
Hindi | 
Thai Mnll | ! 

Ill 



BsaXI 

I 



Mnll 
Apol 
EcoRI 
Tsp509I 
Mnll | 
Taql | | 
III 



Mnll 
I 



Maell 

I 



CTCCTGTCCCTCGCGTTGTCAACCTACCCCTCCTCCCTCGAATTCTAATCCTTTGAACGT 



GAGGACAGGGAGCGCAACAGTTGGATGGGGAGGAGGGAGCTTAAGATTAGGAAACTTGCA 

Bbvl CviRI 
Rsal| Fnu4HI | TspRI Nlalll BsmI 

TatI | | CviJI Tsel | j Sf aNI | Nspl TspRI | 

I II I II I II I II 

AGTACAACAGCCTGTTGCTGCATCGTCAGTGCCTTCCTACATGCCCCCACTGAATGCTGA 



61 



TCATGTTGTCGGACAACGACGTAGCAGTCACGGAAGGATGTACGGGGGTGACTTACGACT 



Bfal 
Bsal | 
BsmAI I 

I I 



Hpyl8 8IX 
Ahdl I 
HaeIV| j 
Hin4l| j 
Bed | |XcmI j Bed 
III II I 



BseMII 
Hpyl78III 
BsaAI 
BsaBI 
Hpyl8 8IX SnaBI 
Ddel |MaeIl| 
I I II 



TGATGTTCTCCCTAGAGACCATCTGTCAGATGGAAGTTTCTCAGATACGTATCCAGACAT 

121 + + + + + + 180 

ACTACAAGAGGGATCTCTGGTAGACAGTCTACCTTCAAAGAGTCTATGCATAGGTCTGTA 



Dpnl 
Sau3AI |Tsp509I 
Cac8I | | Msel| 

III II 



CviJI 
Haelll 
Sau96I | 
II 



ScrFI 
EcoRII | Nlalll 
' I I 



TACAACGCAAGCGATCATCTTAATTTTCTTGGCCCTATCGCCTTTCCTGGTCATGTTGCT 

181 + + + + + + 240 

ATGTTGCGTTCGCTAGTAGAATTAAAAGAACCGGGATAGCGGAAAGGACCAGTACAACGA 



BsaAI 
Maelll 

SnaBI Bsu36I Rsal 
Maell | Ddel TatI | 

II III 
CACTTCGTATCTAAAGATTATCATTACTTTAGTCTTATTACGTAACGCCTTAGGAGTACA 

241 + + + + + + 300 

GTGAAGCATAGATTTCTAATAGTAATGAAATCAGAATAATGCATTGCGGAATCCTCATGT 
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Fig. 4 (con't) 



Mnll 
Tthlllll 
Bmrl | CviRI 
Bsrl| | Fokl Mnll | Sthl32I 

II I I I I I 

ACAAACACCTCCCAGTCAAGTCCTCAATGGGATTGCACTCATCCTATCTATTTATGTGAT 

301 + + + + + + 360 

TGTTTGTGGAGGGTCAGTTCAGGAGTTACCCTAACGTGAGTAGGATAGATAAATACACTA 



SfaNI 
CviJI | 
BstXI 
MslI 



BSCGI | 
BsaJI | I 
BstDSI j j 
II 



Bcgl 

I 



Cac8I 

I 



TaqI 
CjePI | CviJI 
I I I 



Bcgl 

I 



GTTCCCCACGGGAGTGGCTATGTATAAAGATGCTCGCAAGGAAATCGAAGCCAATACCAT 



CAAGGGGTGCCCTCACCGATACATATTTCTACGAGCGTTCCTTTAGCTTCGGTTATGGTA 



Tthlllll 
BstAPI 
PstI 
TspRI 
CviRI 



Mnll 
BtsI 
Sfcl 
Earl 
CjePI 
Mnll | 
CviJI | j 
MboII | | I 
I II' 



Mwol 

I 



TspRI 
Taal | 

I I 



Dral 
Msel | 
II 



TCCTCAAAGCCTCTTCACTGCAGAAGGTGCTGAAACAGTGTTTGTCGCTTTAAACAAATC 
AGGAGTTTCGGAGAAGTGACGTCTTCCACGACTTTGTCACAAACAGCGAAATTTGTTTAG 



Tsp509I 
Hhal Msel | 
I II 



Bsbl 
I 



Alul 
CviJI 
Hindlll | 
Apol | | 

Tsp509I | | 

I I ' 



TAAAGAACCTTTGCGCTCTTTCTTAATTCGCAACACTCCAAAAGCACAAATTCAAAGCTT 



ATTTCTTGGAAACGCGAGAAAGAATTAAGCGTTGTGAGGTTTTCGTGTTTAAGTTTCGAA
Fig. 4 (con't)



Dpnl 
Bglll | 
BstYI | 
Sau3AI | 
I I 



Apol 

Tsp509I Hhal 
Hpyl8 8IX |TaqI Hin4I 

III I 



Hpyl88IX 
Mnll | 
TspRI | | 
Btsl | | | 
III 



TTACAAGATCTCACAGAAAACCTTCCCTTCGGAAATTCGAGCGCACCTCACTGCCTCCGA 



AATGTTCTAGAGTGTCTTTTGGAAGGGAAGCCTTTAAGCTCGCGTGGAGTGACGGAGGCT 

Tsp509I 

Mnll Hpyl88IX NspV | 

Bpll I Mmel SimI I TaqI j Hinfl 

I I I I I I I I 

CTTTGTAATCATTATTCCTGCTTTTATTATGGGTCAGATAAAAAATGCTTTCGAAATTGG 

601 + + + + + + 660

GAAACATTAGTAATAAGGACGAAAATAATACCCAGTCTATTTTTTACGAAAGCTTTAACC 

Dpnl 
Plel 

Sau3AI | SfaNI 
Hpyl78IIl| | Maelll Acll | 

Hin4l|| | Tsp45I Maell | 

III I I I I 

AGTCTTGATCTATCTACCTTTCTTTGTTATTGATTTAGTGACTGCTAACGTTCTTGTAGC 

661 + + + + + + 720

TCAGAACTAGATAGATGGAAAGAAACAATAACTAAATCACTGACGATTGCAAGAACATCG 



CviRI 
I 



Maelll 
Mnll | 
Clal | | 
TaqI | | 
I I I 



Maelll 
Dral | 
Msel| I 
II I 



Dpnl 
Sau3AI | 
I I 



GATGCAGATGATGATGTTATCCCCTCTATCGATTTCGTTACCTTTAAAGTTACTTTTGAT 

CTACGTCTACTACTACAATAGGGGAGATAGCTAAAGCAATGGAAATTTCAATGAAAACTA 

Beef I 
Dral 
Msel 
Alul 
CviJI 
Dpnl | 
Bell 
Sau3AI 
CviJI | 
Mwol 
BsaJI | 

AccI Styl j 

Nlalll | Bed Fokl | | 

II I III 

CGTCATGGTAGACGGATGGACATTACTGCTCCAAGGGCTTATGATCAGCTTTAAATAAGG 

781 + + + + + + 840

GCAGTACCATCTGCCTACCTGTAATGACGAGGTTCCCGAATACTAGTCGAAATTTATTCC 
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Fig. 4 (con't) 

BsaAI 

Prall RsaI 

Maell| Bfal Seal 

AflHI I Mw oI Mwol Spel| TatI | 

I II I I M I I 

ACACGTGCCGTGTTAGCATTTTTCGCAACTAGTTTCAAATCTGTTCTTTTTGAGTACTCC 

841 + + + + + + 900 

TGTGCACGGCACAATCGTAAAAAGCGTTGATCAAAGTTTAGACAAGAAAAACTCATGAGG 

Sthl32I 
Alul | 
CviJI j 

NlalV Ddel | j 

BanI | Bed Mnl 1 1 j | 

II I II I I 

TACCAATCATTATTACTTATTTTGATTGTTTCGGCACCTCCCATCATCTTAGCTTCCATA 

901 + + + + + + 960 

ATGGTTAGTAATAATGAATAAAACTAACAAAGCCGTGGAGGGTAGTAGAATCGAAGGTAT 

Acil 

Dpnl Fnu4HI 
Hpyl78III Sau3AI | Taul 

BslI | MboII | j CviJI | Bsbl 

II III Ml 

GTCGGGATTATGGTTGCGATCTTCCAAGCCGCAACACAAA 

961 + + + + 1000 

CAGCCCTAATACCAACGCTAGAAGGTTCGGCGTTGTGTTT 
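The enzyme labels in Figure 4 mark where recognition sequences occur along the top strand. The following is a minimal sketch of such a scan over the first 60 nucleotides of the figure, assuming the commonly cited recognition sequences for EcoRI (GAATTC), TaqI (TCGA) and HincII (GTYRAC). The names `SITES` and `find_sites` are illustrative only, and the reported numbers are the 1-based start of each recognition sequence, not the cut position.

```python
import re

# IUPAC nucleotide codes used by the recognition sequences below.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "N": "[ACGT]"}

# Assumed recognition sequences (all palindromic, so one strand suffices).
SITES = {"EcoRI": "GAATTC", "TaqI": "TCGA", "HincII": "GTYRAC"}


def find_sites(seq, sites=SITES):
    """Return {enzyme: [1-based start positions]} for each recognition site."""
    seq = seq.upper()
    hits = {}
    for enzyme, site in sites.items():
        pattern = "".join(IUPAC[base] for base in site)
        # A lookahead reports overlapping occurrences as well.
        hits[enzyme] = [m.start() + 1
                        for m in re.finditer(f"(?={pattern})", seq)]
    return hits


# Positions 1-60 of the top strand shown in Figure 4.
FIG4_FIRST_60 = "CTCCTGTCCCTCGCGTTGTCAACCTACCCCTCCTCCCTCGAATTCTAATCCTTTGAACGT"
print(find_sites(FIG4_FIRST_60))
# {'EcoRI': [40], 'TaqI': [38], 'HincII': [18]}
```

These three enzymes all appear among the labels printed above the first sequence row of Figure 4.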
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Figure 5: 

tagctttata caaagtatag aaaaataaca cgacaataaa aggagcggtg ttttctcttc 60 

tgaggtaaat cagcctcaaa gatactacgc catagtaaag atg aag ttt ttt agc 115

Met Lys Phe Phe Ser

tta att ttt aaa gat gat gat gtc tcc cca aat aag aag gtt tta tct
Leu Ile Phe Lys Asp Asp Asp Val Ser Pro Asn Lys Lys Val Leu Ser

cct gaa gct ttc tct gct ttc ctt gat gcc aaa gag ctg tta gaa aaa
Pro Glu Ala Phe Ser Ala Phe Leu Asp Ala Lys Glu Leu Leu Glu Lys

aca aaa gcc gat agc gaa gcc tat gtt gca gag aca gaa caa aag tgt
Thr Lys Ala Asp Ser Glu Ala Tyr Val Ala Glu Thr Glu Gln Lys Cys

gca caa att cgt caa gaa gct aaa gat caa gga ttt aaa gag gga tct
Ala Gln Ile Arg Gln Glu Ala Lys Asp Gln Gly Phe Lys Glu Gly Ser

gaa tcc tgg agc aag caa att gct ttc tta gaa gaa gaa act aaa aat
Glu Ser Trp Ser Lys Gln Ile Ala Phe Leu Glu Glu Glu Thr Lys Asn

cta cgc ata aga gta cgc gag gcc ttg gtt cct ctg gca att gcg agt
Leu Arg Ile Arg Val Arg Glu Ala Leu Val Pro Leu Ala Ile Ala Ser
90 95 100

gtg agg aaa atc att ggg aag gaa ctc gaa tta cat cct gaa act att
Val Arg Lys Ile Ile Gly Lys Glu Leu Glu Leu His Pro Glu Thr Ile
105 110 115

gtc tct att att tct caa gca ttg aaa gag ctc aca caa aat aaa cat
Val Ser Ile Ile Ser Gln Ala Leu Lys Glu Leu Thr Gln Asn Lys His
120 125 130

atc att atc tct gtc aat ccc aaa gat tta cct ctt gtt gag aaa agt
Ile Ile Ile Ser Val Asn Pro Lys Asp Leu Pro Leu Val Glu Lys Ser
135 140 145

cgt cct gaa ctc aag aac atc gtg gag tat gct gac tcc tta att ctt
Arg Pro Glu Leu Lys Asn Ile Val Glu Tyr Ala Asp Ser Leu Ile Leu
150 155 160 165

aca gca aaa cct gat gtt act cct ggg ggt tgc att atc gag act gaa
Thr Ala Lys Pro Asp Val Thr Pro Gly Gly Cys Ile Ile Glu Thr Glu
170 175 180

gca ggg atc atc aat gcg cag ctt gat gta caa tta gat gcc tta gaa
Ala Gly Ile Ile Asn Ala Gln Leu Asp Val Gln Leu Asp Ala Leu Glu
185 190 195
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Fig. 5 (con't) 

aaa gct ttc tcg act ata cta aaa gcg aag aac cct gta gac gag cca 739
Lys Ala Phe Ser Thr Ile Leu Lys Ala Lys Asn Pro Val Asp Glu Pro
200 205 210

tct gag act tca tca tcc acg gat tct tct tct tta tct aat gat cag 787
Ser Glu Thr Ser Ser Ser Thr Asp Ser Ser Ser Leu Ser Asn Asp Gln
215 220 225

gat aag aaa gaa taaaggtatt cactattatg cgatccattt ttcgattttc 839
Asp Lys Lys Glu
230

cctttgtttt tttacgctga gcgtctcatg ctgatttgct gacgccagtc tatatgaaaa 899
c 900
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Figure 6 (RY-35) 

Restriction analysis of CPN100422

BseMII 
MboII | Ddel 

Alul Acil | | Hpyl88IX 

CviJI BsrBI | I Mnl I | 

| I I I I I 

TAGCTTTATACAAAGTATAGAAAAATAACACGACAATAAAAGGAGCGGTGTTTTCTCTTC 

1 + + + + + + 60

ATCGAAATATGTTTCATATCTTTTTATTGTGCTGTTATTTTCCTCGCCACAAAAGAGAAG 

Tsp509I 
Msel | 
Alul | | 

Earl CviJI Mnll CviJI j j 

I I I I M 

TGAGGTAAATCAGCCTCAAAGATACTACGCCATAGTAAAGATGAAGTTTTTTAGCTTAAT 

61 + + + + + + 120 

ACTCCATTTAGTCGGAGTTTCTATGATGCGGTATCATTTCTACTTCAAAAAATCGAATTA 

Alul 
CviJI 

Dral Hindlll | SfaNI 

Msel I Hin4I BsmAI Hpyl78III | | Mwol | 

II II 

TTTTAAAGATGATGATGTCTCCCCAAATAAGAAGGTTTTATCTCCTGAAGCTTTCTCTGC 

121 + + + + + + 180

AAAATTTCTACTACTACAGAGGGGTTTATTCTTCCAAAATAGAGGACTTCGAAAGAGACG 

CviRI 

Eco57I Alul BsmAI | 

Acellll CviJI CviJI CviJI Mwol j 

II I I III 

TTTCCTTGATGCCAAAGAGCTGTTAGAAAAAACAAAAGCCGATAGCGAAGCCTATGTTGC 

181 + + + + + + 240 

AAAGGAACTACGGTTTCTCGACAATCTTTTTTGTTTTCGGCTATCGCTTCGGATACAACG 

Apol 
Tsp509I 

BsiHKAI | Dpnl 

Bspl286I j Sau3AI | 

BseSI | j Alul | j Dral 

CviRI j j CviJI j j Msel | 

ApaLI | j |Hpyl78III | | j Mnll || 

I I I I I I M Ml 

AGAGACAGAACAAAAGTGTGCACAAATTCGTCAAGAAGCTAAAGATCAAGGATTTAAAGA 

241 + + + + + + 300 

TCTCTGTCTTGTTTTCACACGTGTTTAAGCAGTTCTTCGATTTCTAGTTCCTAAATTTCT 
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Fig. 6 (con't) 



ScrFI 
EcoRII I 
Alwl | | 
Hinf I 
Tfil 
BsaBI | 



Hpyl8 8IX| 
Dpnl | I 
BstYI | I j 
Sau3AI | j j 
I I 



Tsp509I 
Cac8I | 
I I 



Tthlllll 
Bpml | 
Ddel j 



MboII 
MboII | 
I I 



GGGATCTGAATCCTGGAGCAAGCAAATTGCTTTCTTAGAAGAAGAAACTAAAAATCTACG 

301 + + + + + + 360 

CCCTAGACTTAGGACCTCGTTCGTTTAACGAAAGAATCTTCTTCTTTGATTTTTAGATGC 



BsaJI 
Styl 
CviJI | 
Hael j 
Haelll | 
Thai | | 
Rsal | j I NlalV 



Mnll | | Stul|DrdIl| Tsp509I 



CjePI 
Mnll 
Mnll | 
Muni | j 



I 



III II II I I II 

CATAAGAGTACGCGAGGCCTTGGTTCCTCTGGCAATTGCGAGTGTGAGGAAAATCATTGG 

361 + + + + + + 420 

GTATTCTCATGCGCTCCGGAACCAAGGAGACCGTTAACGCTCACACTCCTTTTAGTAACC 



Hpyl78III 
CjePI | 
Tsp509I | j 
Fokl TaqI | j j 
III I 



Bce83I BsmAI 
I I 



GAAGGAACTCGAATTACATCCTGAAACTATTGTCTCTATTATTTCTCAAGCATTGAAAGA 
CTTCCTTGAGCTTAATGTAGGACTTTGATAACAGAGATAATAAAGAGTTCGTAACTTTCT 



Banll 
BsiHKAI 
Bspl286I 
Sad 
Tthlllll 
Alul | 

CviJI | Mnll 

I I I 
GCTCACACAAAATAAACATATCATTATCTCTGTCAATCCCAAAGATTTACCTCTTGTTGA 

481 + + + + + + 540 

CGAGTGTGTTTTATTTGTATAGTAATAGAGACAGTTAGGGTTTCTAAATGGAGAACAACT 






WO 00/24765 



PCT/CA99/00992 



Fig. 6 (con't) 



Hpyl78III Tsp509I 

Bce83I Hpyl78III Smll | Plel Hinfl Msel | 

I I I I I I M 

GAAAAGTCGTCCTGAACTCAAGAACATCGTGGAGTATGCTGACTCCTTAATTCTTACAGC 



541 



CTTTTCAGCAGGACTTGAGTTCTTGTAGCACCTCATACGACTGAGGAATTAAGAATGTCG 



ScrFI 

BsaJI Hpyl78III Eco57I 

EcoRII | BsmAI | Dpnl Fspl 

Maelll | | CviRI | TaqI j Sau3AI | Alwl | 

III I I I I I I I I 

AAAACCTGATGTTACTCCTGGGGGTTGCATTATCGAGACTGAAGCAGGGATCATCAATGC 

TTTTGGACTACAATGAGGACCCCCAACGTAATAGCTCTGACTTCGTCCCTAGTAGTTACG 



Tsp509I 

Alul Bbvl | 

CviJI Rsal 

Fnu4HI | BsrGI | 

Hhal| | SfaNI j 

Tselj | TatI j 

III II 



Hpyl78III 
Alul | 
CviJI | 
Ddel Hindlll | TaqI 
I III 



GCAGCTTGATGTACAATTAGATGCCTTAGAAAAAGCTTTCTCGACTATACTAAAAGCGAA 
CGTCGAACTACATGTTAATCTACGGAATCTTTTTCGAAAGAGCTGATATGATTTTCGCTT 



Ddel 
Hpyl88IX 
Fokl 
Bed 
BsmAI 
CviJI 
HaelV 
Hin4I 
BseMII | 
MboII | 
Accl| j 
Sfcl | | | 
II I 



Hinfl 
Tfil 
MboII 
BsaJI | 
BstDSI j 
MboII j 
I I 



GAACCCTGTAGACGAGCCATCTGAGACTTCATCATCCACGGATTCTTCTTCTTTATCTAA 

721 + + + + + + 780 

CTTGGGACATCTGCTCGGTAGACTCTGAAGTAGTAGGTGCCTAAGAAGAAGAAATAGATT 



Hpyl78III 
Dpnl | 
Bell | | 
Sau3AI | | 
I I 



Dpnl 
Sau3AI | 
Alwl | | 

Msll| I I 

II I I 



TGATCAGGATAAGAAAGAATAAAGGTATTCACTATTATGCGATCCATTTTTCGATTTTCC

781 + + + + + + 840

ACTAGTCCTATTCTTTCTTATTTCCATAAGTGATAATACGCTAGGTAAAAAGCTAAAAGG 
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Fig. 6 (con't) 



Bpull02I 
Hgal | 
BseMII | Ddel 
I I I 



Nlalll 
BsmAI | 
BsmBI j 
Mwol | | 

II I 



BsrI 
PshAI 
BsaHI | Hgal 

■ I I 



CTTTGTTTTTTTACGCTGAGCGTCTCATGCTGATTTGCTGACGCCAGTCTATATGAAAAC 



GAAACAAAAAAATGCGACTCGCAGAGTACGACTAAACGACTGCGGTCAGATATACTTTTG
Figure 7: CPN100424

tgttcgcgat tggcactaat cccccctttt gttatggtga ataaaaaggt atgcgtggat 60 

tatggttcgt cgatctattt ctttttgctt gttctttcta atg aca ttg ctg tgc 115 

Met Thr Leu Leu Cys 



tgt aca agc tgt aac agc agg tct cta att gtg cac ggt ctt cct ggc
Cys Thr Ser Cys Asn Ser Arg Ser Leu Ile Val His Gly Leu Pro Gly

aga gaa gcg aat gag att gtg gtg ctt ttg gta agc aaa ggg gtg gct
Arg Glu Ala Asn Glu Ile Val Val Leu Leu Val Ser Lys Gly Val Ala

gca caa aaa ttg cct caa gct gca gcg gct aca gcc gga gca gct act
Ala Gln Lys Leu Pro Gln Ala Ala Ala Ala Thr Ala Gly Ala Ala Thr

gag caa atg tgg gat atc gcg gtt ccg tca gca caa atc aca gag gcc
Glu Gln Met Trp Asp Ile Ala Val Pro Ser Ala Gln Ile Thr Glu Ala

ctt gcc att cta aat caa gcg ggt ctt cca cgt atg aaa ggg aca agc
Leu Ala Ile Leu Asn Gln Ala Gly Leu Pro Arg Met Lys Gly Thr Ser

ctg tta gat ctt ttt gca aaa caa ggt ctt gtt cct tcc gag ctt cag
Leu Leu Asp Leu Phe Ala Lys Gln Gly Leu Val Pro Ser Glu Leu Gln
90 95 100

gaa aaa atc cgt tat caa gaa ggc tta tca gaa cag atg gcc tct acg
Glu Lys Ile Arg Tyr Gln Glu Gly Leu Ser Glu Gln Met Ala Ser Thr
105 110 115

att aga aaa atg gat ggc gtt gtc gat gcc tca gta cag att tcc ttc
Ile Arg Lys Met Asp Gly Val Val Asp Ala Ser Val Gln Ile Ser Phe
120 125 130

act aca gaa aat gaa gat aat ctt cct tta aca gcc tct gtg tat att
Thr Thr Glu Asn Glu Asp Asn Leu Pro Leu Thr Ala Ser Val Tyr Ile
135 140 145

aag cat cga ggg gtt ttg gac aat ccg aac agc att atg gtt tcc aaa
Lys His Arg Gly Val Leu Asp Asn Pro Asn Ser Ile Met Val Ser Lys
150 155 160 165

att aag cgc ctt att gca agt gct gtt cca gga ctt gtg cca gag aac
Ile Lys Arg Leu Ile Ala Ser Ala Val Pro Gly Leu Val Pro Glu Asn
170 175 180

gtc tct gta gtg agc gat cgc gca gct tat agt gat att aca att aat
Val Ser Val Val Ser Asp Arg Ala Ala Tyr Ser Asp Ile Thr Ile Asn
185 190 195
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Fig. 7 (con't) 

ggt cct tgg gga tta aca gaa gaa atc gat tat gtt tct gtt tgg ggt 739
Gly Pro Trp Gly Leu Thr Glu Glu Ile Asp Tyr Val Ser Val Trp Gly
200 205 210

att att ctt gcg aag tct tcg ctc acc aaa ttc cgt ctc att ttt tat 787
Ile Ile Leu Ala Lys Ser Ser Leu Thr Lys Phe Arg Leu Ile Phe Tyr
215 220 225

gtc ttg att ctc att tta ttt gtt att tct tgt ggt ctc ctt tgg gtc 835
Val Leu Ile Leu Ile Leu Phe Val Ile Ser Cys Gly Leu Leu Trp Val
230 235 240 245

att tgg aaa act cat act ctc att atg act atg gga ggt aca aaa ggg 883
Ile Trp Lys Thr His Thr Leu Ile Met Thr Met Gly Gly Thr Lys Gly
250 255 260

ttc ttc aac cct aca cca tat aca aag aat gcc ttg gaa gcc aag aaa 931
Phe Phe Asn Pro Thr Pro Tyr Thr Lys Asn Ala Leu Glu Ala Lys Lys
265 270 275

gcc gag gga gca gct gct gac aaa gag aaa aaa gaa gat gca gat tca 979
Ala Glu Gly Ala Ala Ala Asp Lys Glu Lys Lys Glu Asp Ala Asp Ser
280 285 290

cag ggg gaa agc aaa aat gcg gaa acc agt gat aaa gac tct agt gat 1027
Gln Gly Glu Ser Lys Asn Ala Glu Thr Ser Asp Lys Asp Ser Ser Asp
295 300 305

aaa gat gct cca gaa gga agc aat gaa att gag ggt gct tagtgactgc 1076
Lys Asp Ala Pro Glu Gly Ser Asn Glu Ile Glu Gly Ala
310 315 320

caacactttt ggaactctag acatcttgat gaagcactcc aaggaagatg acctctccag 1136 

gtttcttcct aaaaatcttc ttgttgaatc tcctcatccc gaagaaatcc ctttaaaatc 1196 

ttta 1200 
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Figure 8 (RY-36) 

Restriction analysis of CPN100424 

Hpyl78III 
Nrul 

Thai HphI 

I I 
TGTTCGCGATTGGCACTAATCCCCCCTTTTGTTATGGTGAATAAAAAGGTATGCGTGGAT 

1 + + + + + + 60

ACAAGCGCTAACCGTGATTAGGGGGGAAAACAATACCACTTATTTTTCCATACGCACCTA 



Tthlllll 
Dpnl | 
Sau3AI | j 
Drdll TaqI | j j 

I II I I 



Mwol 
Rsal | 
BsrGI | | 
BsrDI MslI TatI j | 

I I III 



TATGGTTCGTCGATCTATTTCTTTTTGCTTGTTCTTTCTAATGACATTGCTGTGCTGTAC 
ATACCAAGCAGCTAGATAAAGAAAAACGAACAAGAAAGATTACTGTAACGACACGACATG 



Mae III 
Alul | 
BspMI j 
Cvi JI j Mwol 
I I I 



ScrFI 
EcoRII 
Taal 
BsiHKAI 
Bspl286I 
BseSI 



CviRI 
ApaLI | 

Bbsl 
MboII 
Bsal | 
BsmAI | 
Tsp509I | 
I I 



AAGCTGTAACAGCAGGTCTCTAATTGTGCACGGTCTTCCTGGCAGAGAAGCGAATGAGAT 



TTCGACATTGTCGTCCAGAGATTAACACGTGCCAGAAGGACCGTCTCTTCGCTTACTCTA 
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Fig. 8 (con't) 



Bbvl 



II 



Tsp509I 
CviRI | 
Bce83I 



Fnu4HI | 
Cvi JI | | 
Tsel | | Bbvl 

II I I I I II 

TGTGGTGCTTTTGGTAAGCAAAGGGGTGGCTGCACAAAAATTGCCTCAAGCTGCAGCGGC 

181 + + + + + + 240 

ACACCACGAAAACCATTCGTTTCCCCACCGACGTGTTTTTAACGGAGTTCGACGTCGCCG 



Sfcl 
Cvi JI | 



Fnu4HI 
Sfcl 
Alul 
CviJI 
Smll Tsel 
I I 



Fnu4HI 
Taul 
Acil 
MspAlI 
Mwol 
PstI 
Fnu4HI 
Mnll 
CviRI | 
Tsel | 



Mwol 
Tsel 
BseMII 
Mspl 
Bbvl | 
CviJI | 
Mwol | | 
I II 



Ddel 
AlwNI 
RleAI 
Alul 
CviJI 
Fnu4HI 
Cj 



CjePI 
NlalV 
Acil | 
Thai I 
EcoRV | | 
I I 



TACAGCCGGAGCAGCTACTGAGCAAATGTGGGATATCGCGGTTCCGTCAGCACAAATCAC 
ATGTCGGCCTCGTCGATGACTCGTTTACACCCTATAGCGCCAAGGCAGTCGTGTTTAGTG 



CviJI 
Haelll 
CjePI | 
Eco0109I j 
Sau96I | 
II 



SimI 
Acil 
Bbsl 
MboII 
Faul | 
Sthl32l| | 
II I 



BsaAI 
CjePI | 
Maell | 

II 



CviJI 
I 



301 



AGAGGCCCTTGCCATTCTAAATCAAGCGGGTCTTCCACGTATGAAAGGGACAAGCCTGTT 
TCTCCGGGAACGGTAAGATTTAGTTCGCCCAGAAGGTGCATACTTTCCCTGTTCGGACAA 
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Fig. 8 (con't) 

Dpnl 

Bglll | Hpyl78III 
BsmFI j Alul | 

BstYI j CviJI j 

Sau3AI | CviRI Eco57I Hpyl88IX | | Hpyl78III 

III I III I 

AGATCTTTTTGCAAAACAAGGTCTTGTTCCTTCCGAGCTTCAGGAAAAAATCCGTTATCA 

361 + + + + + + 420 

TCTAGAAAAACGTTTTGTTCCAGAACAAGGAAGGCTCGAAGTCCTTTTTTAGGCAATAGT 

Pf 111081 
CviJI 
Hael 

Hpyl88IX Haelll 



CviJI | Bed 

II II 



Bed Fokl 
Mnll SfaNI TaqI | 

I III 



AGAAGGCTTATCAGAACAGATGGCCTCTACGATTAGAAAAATGGATGGCGTTGTCGATGC 

421 + + + + + + 480 

TCTTCCGAATAGTCTTGTCTACCGGAGATGCTAATCTTTTTACCTACCGCAACAGCTACG 

Rsal 

TatI I BseMII Msel AlwNI 

Ddel | j Mnll | Sfcl MboII MboII | CviJI | 

I I I I I I I I I I I 

CTCAGTACAGATTTCCTTCACTACAGAAAATGAAGATAATCTTCCTTTAACAGCCTCTGT 

481 + + + + + + 540 

GAGTCATGTCTAAAGGAAGTGATGTCTTTTACTTCTATTAGAAGGAAATTGTCGGAGACA 

Mnll 
CjePI | 

Msel j Msel 
Mnll | | TaqI SfaNI Hpyl88IX CjePI Tsp509I | 

I I I I I I I I I 

GTATATTAAGCATCGAGGGGTTTTGGACAATCCGAACAGCATTATGGTTTCCAAAATTAA 

541 + + + + + + 600 

CATATAATTCGTAGCTCCCCAAAACCTGTTAGGCTTGTCGTAATACCAAAGGTTTTAATT 

Bsp24I Cjel 

CjePI BsmAI CjePI 

Cjel I BsmBI Dpnl 

Haell Cjel ScrFI || Hin4I Cjel| Bsp24l| 

Hhal| CviRI | EcoRII | Maell Sf cl j Sau3AI | | 



GCGCCTTATTGCAAGTGCTGTTCCAGGACTTGTGCCAGAGAACGTCTCTGTAGTGAGCGA 

601 + + + + + + 660 

CGCGGAATAACGTTCACGACAAGGTCCTGAACACGGTCTCTTGCAGAGACATCACTCGCT 
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Fig. 8 (con't) 



Alul 
CviJI 
Fnu4HI 

Hhal | BsaJI 

Tselj Styl 

Thai | j Avail | 

BplI | j j Sau96I j 

BsiEI MM Msel | | 

Pvul III Bbvl Vspl j j 

Sgfl Ml Cjel |Tsp509I | j j 

I I II I II II I ' 



Cje 
Msel 

I 



Tthlllll 
Clal | 
Taql | 

I ' 



TCGCGCAGCTTATAGTGATATTACAATTAATGGTCCTTGGGGATTAACAGAAGAAATCGA 

661 + + + + + + 720 

AGCGCGTCGAATATCACTATAATGTTAATTACCAGGAACCCCTAATTGTCTTCTTTAGCT 



Bsp24I 
CjePI 

Bbsl Cjel | Apol BsmAI 

MboII MboII HphI | j Tsp509I BsmBI 

I I I II I I 

TTATGTTTCTGTTTGGGGTATTATTCTTGCGAAGTCTTCGCTCACCAAATTCCGTCTCAT 

721 + + + + + + 780 

AATACAAAGACAAACCCCATAATAAGAACGCTTCAGAAGCGAGTGGTTTAAGGCAGAGTA 



Hinfl 
Tfil 
Hpyl78III 
Cjel | 

CjePI | | Bsal 
Bsp24l|| j BsmAI SimI 

III I II 
TTTTTATGTCTTGATTCTCATTTTATTTGTTATTTCTTGTGGTCTCCTTTGGGTCATTTG 

781 + + + + + + 840 

AAAAATACAGAACTAAGAGTAAAATAAACAATAAAGAACACCAGAGGAAACCCAGTAAAC 

MboII 

Mnll Rsal | 



GAAAACTCATACTCTCATTATGACTATGGGAGGTACAAAAGGGTTCTTCAACCCTACACC 
CTTTTGAGTATGAGAGTAATACTGATACCCTCCATGTTTTCCCAAGAAGTTGGGATGTGG 
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Fig. 8 (con't) 



BsaJI 
Cvi JI | 

BsmI Bbvl | I 

BsaJI | CviJI Mwol | j | 
Styl j Mwol | Mnll | III 
II I I II II' 



AlwNI 
Fnu4HI 
Alul | 
CviJI j 
MspAlI j 
PvuII j 
Tsel j 
Fnu4HI | j 
Tsel | | | 
Mwol || || 
II II 



ATATACAAAGAATGCCTTGGAAGCCAAGAAAGCCGAGGGAGCAGCTGCTGACAAAGAGAA 



TATATGTTTCTTACGGAACCTTCGGTTCTTTCGGCTCCCTCGTCGACGACTGTTTCTCTT 

MboII Hinfl 
Hinfl | Hin4I | 

Tfil | TspRI | |BfaI 

CviRI | Acil BsrI Plel | | | Bpml 

I I I I I I I I I I 

AAAAGAAGATGCAGATTCACAGGGGGAAAGCAAAAATGCGGAAACCAGTGATAAAGACTC 

TTTTCTTCTACGTCTAAGTGTCCCCCTTTCGTTTTTACGCCTTTGGTCACTATTTCTGAG 

BsrDI Maelll 
Tsp509l| Tsp45I 
iNI Hpyl78III Mnll | j Ddel | Bsbl 

| I III II I 

TAGTGATAAAGATGCTCCAGAAGGAAGCAATGAAATTGAGGGTGCTTAGTGACTGCCAAC 

ATCACTATTTCTACGAGGTCTTCCTTCGTTACTTTAACTCCCACGAATCACTGACGGTTG 



Hpyl78III 
Hpyl78III | 
Bfal| j 
Xbal | | MslI 
III I 



Bpml 
BsaJI | 
Styl | 
II 



EcoRII 
HgiEII 
Hin4I | 
CjePI | MboII 

I I I 



ACTTTTGGAACTCTAGACATCTTGATGAAGCACTCCAAGGAAGATGACCTCTCCAGGTTT 



TGAAAACCTTGAGATCTGTAGAACTACTTCGTGAGGTTCCTTCTACTGGAGAGGTCCAAA 



MboII 
I 



Hinfl 
CjePI | 
BseRI | | 
Fokl j Tfil 
II I 



Sthl32I 
Mnll | 
Hpyl78III | j 
I ' 



Dral 
MboII | 
Msel j 



CTTCCTAAAAATCTTCTTGTTGAATCTCCTCATCCCGAAGAAATCCCTTTAAAATCTTTA 

1141 + + + + + + 1200 

GAAGGATTTTTAGAAGAACAACTTAGAGGAGTAGGGCTTCTTTAGGGAAATTTTAGAAAT 
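Each block of the restriction maps prints the top strand above its complementary bottom strand, so a transcription of a figure can be checked by base pairing. The following is a minimal sketch of that check; the function names `complement` and `strands_pair` are illustrative and do not come from the specification.

```python
# Watson-Crick pairing: A with T, C with G.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(strand: str) -> str:
    """Return the base-by-base complement, in the same left-to-right order."""
    return strand.upper().translate(COMPLEMENT)


def strands_pair(top: str, bottom: str) -> bool:
    """True if `bottom` is the complement of `top` as printed in the maps."""
    return complement(top) == bottom.upper()


# Positions 1141-1160 of the last block of Figure 8 (top and bottom rows).
print(strands_pair("CTTCCTAAAAATCTTCTTGT", "GAAGGATTTTTAGAAGAACA"))  # True
```

A False result for any block would point to a transcription error in one of the two rows.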
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Figure 9: CPN100426 

ttgaacccta tggaaatgta tcttatttgt gctgggctat atttcttaat gacaacatca 60 

ttttcctgta tttctaggtt atcagaaaag agaaggagtt atg aca att aga gtc 115 

Met Thr Ile Arg Val

cga aac ctt gcc tac tct gta aat aag aaa aag att cta gat ggt gta
Arg Asn Leu Ala Tyr Ser Val Asn Lys Lys Lys Ile Leu Asp Gly Val

act ttt tct tta gag cga ggg cac att aca ctg ttt gtt ggg aag agt
Thr Phe Ser Leu Glu Arg Gly His Ile Thr Leu Phe Val Gly Lys Ser

ggt tca gga aaa aca atg att tta cgt gct ttg gcg ggc tta gtc cag
Gly Ser Gly Lys Thr Met Ile Leu Arg Ala Leu Ala Gly Leu Val Gln

ccc act caa gga gat att tgg att gaa ggg gag gct cca gct cta gtt
Pro Thr Gln Gly Asp Ile Trp Ile Glu Gly Glu Ala Pro Ala Leu Val

ttc caa caa ccc gag tta ttt tcc cat atg aca gta tta gga aat tgc
Phe Gln Gln Pro Glu Leu Phe Ser His Met Thr Val Leu Gly Asn Cys

acc cat cca caa atc cat atc aag ggt cgt agt acc gaa gaa gct cga
Thr His Pro Gln Ile His Ile Lys Gly Arg Ser Thr Glu Glu Ala Arg
90 95 100

gaa aag gcg ttc gag ctt tta cat ttg ttg gat att gaa gag gtt gct
Glu Lys Ala Phe Glu Leu Leu His Leu Leu Asp Ile Glu Glu Val Ala
105 110 115

aag aat tat cct gac cag ctc tct ggg gga caa aaa caa cgt gtg gct
Lys Asn Tyr Pro Asp Gln Leu Ser Gly Gly Gln Lys Gln Arg Val Ala
120 125 130

att gta cgt tct tta tgt atg gat aaa cat aca tta ctt ttt gat gaa
Ile Val Arg Ser Leu Cys Met Asp Lys His Thr Leu Leu Phe Asp Glu
135 140 145

cct aca tcg gct tta gat cct ttt gct acg gca tcg ttc cga cat ctt
Pro Thr Ser Ala Leu Asp Pro Phe Ala Thr Ala Ser Phe Arg His Leu
150 155 160 165

tta gaa aca ctt cga gac cag gaa ctg act gta ggg tta act act cat
Leu Glu Thr Leu Arg Asp Gln Glu Leu Thr Val Gly Leu Thr Thr His
170 175 180

gac atg caa ttt gtt cat agt tgt ttg gat cgt atc tat ctt ata gat
Asp Met Gln Phe Val His Ser Cys Leu Asp Arg Ile Tyr Leu Ile Asp
185 190 195
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Fig. 9 (con't) 

caa gga act gtt gcg ggg gtc tat gac aag cgt gac gga gag ctc gat 739
Gln Gly Thr Val Ala Gly Val Tyr Asp Lys Arg Asp Gly Glu Leu Asp
200 205 210

tct ggt cat cca tta tcg aaa tat atc cac tct gct caa taggactaca 788
Ser Gly His Pro Leu Ser Lys Tyr Ile His Ser Ala Gln
215 220 225

gctgctagag cagctgtagt gatactttag aatcctgacc agtggcagga atgagcggca 848
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Figure 10 (RY-37) 

Restriction enzyme analysis of CPN100426 



CviJI Msel 

I I 

TTGAACCCTATGGAAATGTATCTTATTTGTGCTGGGCTATATTTCTTAATGACAACATCA 

1 + + + + + + 60 

AACTTGGGATACCTTTACATAGAATAAACACGACCCGATATAAAGAATTACTGTTGTAGT 

Hpyl8 8IX 

Bfal Hpyl88IX Tsp509I Hinfl | Plel 

I I I I I I 

TTTTCCTGTATTTCTAGGTTATCAGAAAAGAGAAGGAGTTATGACAATTAGAGTCCGAAA 

61 + + + + + + 120

AAAAGGACATAAAGATCCAATAGTCTTTTCTCTTCCTCAATACTGTTAATCTCAGGCTTT 

Hpyl78III 
Bfal | 
Xbal | 
Hinfl | 
CjePI Tfil j 

I II 

CCTTGCCTACTCTGTAAATAAGAAAAAGATTCTAGATGGTGTAACTTTTTCTTTAGAGCG 

121 + + + + + + 180

GGAACGGATGAGACATTTATTCTTTTTCTAAGATCTACCACATTGAAAAAGAAATCTCGC 



| Maelll Mnll 
| Bed | CjePI | 

III M 



Bspl286I Hpyl78III Faul 

Tthlllll| Earl MboII Sthl32l| 

Bmgl | | TspRI | Drdll | BsaAI | j 

BseSlM Taal | | Alol | | Cjel Maell | | | 

Ill I I I I I 

AGGGCACATTACACTGTTTGTTGGGAAGAGTGGTTCAGGAAAAACAATGATTTTACGTGC 

181 + + + + + + 240 

TCCCGTGTAATGTGACAAACAACCCTTCTCACCAAGTCCTTTTTGTTACTAAAATGCACG 

Ddel 
Bce83I | 
CviJI j Cjel 
Cac8I | | CviJI BslI Mnll NlalV Alul 

Acil | || BspGI | Smll | Bpml |CjeI Cvi JI | CviJI 

I I II I I I I III Ml 

TTTGGCGGGCTTAGTCCAGCCCACTCAAGGAGATATTTGGATTGAAGGGGAGGCTCCAGC 

241 + + + + + + 300 

AAACCGCCCGAATCAGGTCGGGTGAGTTCCTCTATAAACCTAACTTCCCCTCCGAGGTCG 

Sthl32I 

Cjel | Mmel Tsp509I 

Bfal Acelll Aval | j Ndel | Taal Fokl | CviRI Bed 

I I III II I I I I I 

TCTAGTTTTCCAACAACCCGAGTTATTTTCCCATATGACAGTATTAGGAAATTGCACCCA 

301 + + + + + + 360 

AGATCAAAAGGTTGTTGGGCTCAATAAAAGGGTATACTGTCATAATCCTTTAACGTGGGT 
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Fig. 10 (con't) 



Pf 111081 
SimI | 



Taqll BslI 

I 



|RsaI 
I 



MboII 
Hpyl78III 
TaqI 
Aval | 
Smll j 
Xhol j 
Alul | j 
Cvi JI | | 
III 



Alul 
CviJI 
Cjel | 
TaqI | 
Mmel | j 
I I I 



TCCACAAATCCATATCAAGGGTCGTAGTACCGAAGAAGCTCGAGAAAAGGCGTTCGAGCT 

361 + + + + + + 420 

AGGTGTTTAGGTATAGTTCCCAGCATCATGGCTTCTTCGAGCTCTTTTCCGCAAGCTCGA 



Mnll 
Earl | 
I I 



Hpyl78III 
Tsp509I 
MboII | 
Ddel | | 
Cjel | | | 
I I I I 



BstXI 
Alul | 
CviJI |AceIII 
Hin4l| | Cjel | 
II I I I 



TTTACATTTGTTGGATATTGAAGAGGTTGCTAAGAATTATCCTGACCAGCTCTCTGGGGG 



AAATGTAAACAACCTATAACTTCTCCAACGATTCTTAATAGGACTGGTCGAGAGACCCCC 
BsmFI 

AflHI | Maell 
Maell j CviJI Rsal| Cjel 

III II I 

ACAAAAACAACGTGTGGCTATTGTACGTTCTTTATGTATGGATAAACATACATTACTTTT 

TGTTTTTGTTGCACACCGATAACATGCAAGAAATACATACCTATTTGTATGTAATGAAAA 



Dpnl 
BstYI | 

Sau3AI j Bcefl 
Alwl | | SfaNI | 

CviJI | | Hpyl8 8IX| j 

III III 
TGATGAACCTACATCGGCTTTAGATCCTTTTGCTACGGCATCGTTCCGACATCTTTTAGA 

541 + + + + + + 600 

ACTACTTGGATGTAGCCGAAATCTAGGAAAACGATGCCGTAGCAAGGCTGTAGAAAATCT 



Tthlllll 



ScrFI 
EcoRII 
Mmel 
Hpyl78III 
TaqI 
Bsal | 
BsmAI j 
XmnI j 



Taal 



Hindi 
Hpal 
Msel | 



Tsp509I 
CviRI ' 
Nlalll 
Nlalll | 
Hpyl78III | 



I 
I 

Nspl 



|AlwNI Sfcl| Msel| Real | 

I I II II II Ml 

AACACTTCGAGACCAGGAACTGACTGTAGGGTTAACTACTCATGACATGCAATTTGTTCA 

601 + + + + + + 660 

TTGTGAAGCTCTGGTCCTTGACTGACATCCCAATTGATGAGTACTGTACGTTAAACAAGT 
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Fig. 10 (con't) 



Taal 
Faul | 
Sthl32l| | 

Dpnl Dpnl | j j 

Sau3AI | Alwl Sau3AI | | | |AciI SimI 

II I II II I I I 

TAGTTGTTTGGATCGTATCTATCTTATAGATCAAGGAACTGTTGCGGGGGTCTATGACAA 

661 + + + + + + 720 

ATCAACAAACCTAGCATAGATAGAATATCTAGTTCCTTGACAACGCCCCCAGATACTGTT 



Mae I I I 
Tsp45I 



Hinf I 
Tfil 
Banll 
BsiHKAI 
Bspl286I 
SacI 
TaqI 
Alul | 
CviJI | 
Fokl | | 
II I 



TaqI Bbvl 

I II I I I I 

GCGTGACGGAGAGCTCGATTCTGGTCATCCATTATCGAAATATATCCACTCTGCTCAATA 

721 + + + + + + 780 

CGCACTGCCTCTCGAGCTAAGACCAGTAGGTAATAGCTTTATATAGGTGAGACGAGTTAT 



Sfcl 
Alul | 
CviJI j 
MspAlI j 
PvuII j 
Fnu4HI 
Mwol | 
Tsel | 
Mwol 
Bfal 



Fnu4HI 
Alul | 
CviJI j 
MspAlI | 
PvuII j 
Sfcl Tsel j 
I II 



Bbvl 
I 



Hpyl78III 

Hinfl | TspRI 

Tfil |BsrI Bsll| 

III II 



GGACTACAGCTGCTAGAGCAGCTGTAGTGATACTTTAGAATCCTGACCAGTGGCAGGAAT 



CCTGATGTCGACGATCTCGTCGACATCACTATGAAATCTTAGGACTGGTCACCGTCCTTA 
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Fig. 10 (con't) 

Fnu4HI 
Taul 
Acil | 
BsrBI | Nlalll 

II I 
GAGCGGCATG 

841 + 850 

CTCGCCGTAC
Figure 11: CPN100508



ctctgattta tggtaattct ttattttcag agccgtcaag tcctttctat tctgttgaat 60 

ttcctaataa cgtaagtaat aaacaatcaa aagtccgcat atg aaa aga cct ttt 115 

Met Lys Arg Pro Phe 
1 5 

ttt acc tat cta tgc atc atc ttc tac gga tct tgt gca tcg tta tct 163
Phe Thr Tyr Leu Cys Ile Ile Phe Tyr Gly Ser Cys Ala Ser Leu Ser
10 15 20

tta cat gca gga ctc tct ttc cca gaa gta cgt gga gct acg gct gct 211
Leu His Ala Gly Leu Ser Phe Pro Glu Val Arg Gly Ala Thr Ala Ala
25 30 35

gtt gtc cat gcc gac tct ggg aag gta ttc tat gat aaa gac ata gat 259
Val Val His Ala Asp Ser Gly Lys Val Phe Tyr Asp Lys Asp Ile Asp
40 45 50

gct gta atc tat cct gcc agc atg acg aaa atc gca act gcc ctc ttt 307
Ala Val Ile Tyr Pro Ala Ser Met Thr Lys Ile Ala Thr Ala Leu Phe
55 60 65

atc cta aag cac tat ccc aca gtc ctc gat act ctc atc aaa gtc aaa 355
Ile Leu Lys His Tyr Pro Thr Val Leu Asp Thr Leu Ile Lys Val Lys
70 75 80 85

caa gat gcg atc gct tcc atc act ccg caa gca aaa aaa caa tca gga 403
Gln Asp Ala Ile Ala Ser Ile Thr Pro Gln Ala Lys Lys Gln Ser Gly
90 95 100

tat cgt agt cct ccc cac tgg tta gaa act gat gga tct aca ata cag 451
Tyr Arg Ser Pro Pro His Trp Leu Glu Thr Asp Gly Ser Thr Ile Gln
105 110 115

ctc cat ctt cga gaa gag ctt tta ggg tgg gac ctg ttc cac gcc tta 499
Leu His Leu Arg Glu Glu Leu Leu Gly Trp Asp Leu Phe His Ala Leu
120 125 130

ctg gtc tgt tct gct aat gat gct gcg aat gtc tta gct atg gca tgt 547
Leu Val Cys Ser Ala Asn Asp Ala Ala Asn Val Leu Ala Met Ala Cys
135 140 145

tgc gga tct gta gag aag ttt atg gat aag ctg aac ttc ttc tta aaa 595
Cys Gly Ser Val Glu Lys Phe Met Asp Lys Leu Asn Phe Phe Leu Lys
150 155 160 165

gaa gaa atc ggc tgc act cat acc cat ttt aat aat ccc cat ggg tta 643
Glu Glu Ile Gly Cys Thr His Thr His Phe Asn Asn Pro His Gly Leu
170 175 180
Fig. 11 (con't)



cat cat ccg aat cac tat act aca acc cgt gat ctt att agc atc atg 691
His His Pro Asn His Tyr Thr Thr Thr Arg Asp Leu Ile Ser Ile Met
185 190 195

cgt tgc gct ctg aaa gaa cct cca ttt cga ggg gtc atc tcc acg aca 739
Arg Cys Ala Leu Lys Glu Pro Pro Phe Arg Gly Val Ile Ser Thr Thr
200 205 210

agc tat aaa ata ggg gct aca aac ctg cat ggc gaa cgg atc cta tcc 787
Ser Tyr Lys Ile Gly Ala Thr Asn Leu His Gly Glu Arg Ile Leu Ser
215 220 225

cca aca aac aaa ttg ctt ctt cct ggg tct acc tac cac tat ccc cca 835
Pro Thr Asn Lys Leu Leu Leu Pro Gly Ser Thr Tyr His Tyr Pro Pro
230 235 240 245

gct tta gga ggg aaa aca ggg acc acc aag act gca ggg aaa aat cta 883
Ala Leu Gly Gly Lys Thr Gly Thr Thr Lys Thr Ala Gly Lys Asn Leu
250 255 260

att atg gct gct gaa aaa aat aac cgc ctc ttg gta acg atc gca acg 931
Ile Met Ala Ala Glu Lys Asn Asn Arg Leu Leu Val Thr Ile Ala Thr
265 270 275

ggc tat tcg ggt cct gtg agt gat ctc tac caa gat gtc att gct cta 979
Gly Tyr Ser Gly Pro Val Ser Asp Leu Tyr Gln Asp Val Ile Ala Leu
280 285 290

tgt gaa acg gta ttt aac gag ccg cta tta aga aaa gag ctc gtc ccc 1027
Cys Glu Thr Val Phe Asn Glu Pro Leu Leu Arg Lys Glu Leu Val Pro
295 300 305

ccc tcc gac tgt ctc caa tta gaa ata gcg aat ctt ggg aag ctt tct 1075
Pro Ser Asp Cys Leu Gln Leu Glu Ile Ala Asn Leu Gly Lys Leu Ser
310 315 320 325

tgc cct ctt cct gag gga ctc tac tat gac ttc tat gcc tcc gaa gat 1123
Cys Pro Leu Pro Glu Gly Leu Tyr Tyr Asp Phe Tyr Ala Ser Glu Asp
330 335 340

cgc gaa cct ctt tct gta tct ttt att gca cat gcg gac gcc ttc cct 1171
Arg Glu Pro Leu Ser Val Ser Phe Ile Ala His Ala Asp Ala Phe Pro
345 350 355

att gaa caa gga gat ctt ctt ggt cat tgg gtt ttt tat gac gat gaa 1219
Ile Glu Gln Gly Asp Leu Leu Gly His Trp Val Phe Tyr Asp Asp Glu
360 365 370
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Fig. 11 (con't) 

ggc aag aaa att tct tcc cag cct ttc tat gcc cct tgt cgt ttt gag 1267
Gly Lys Lys Ile Ser Ser Gln Pro Phe Tyr Ala Pro Cys Arg Phe Glu
375 380 385

cgc act atc aag cct tgg aaa ctc tat atg aaa cgt gtc ttc aca tcg 1315
Arg Thr Ile Lys Pro Trp Lys Leu Tyr Met Lys Arg Val Phe Thr Ser
390 395 400 405

tat aga acc tat atg tct ata acc atg ctg ctc atg tat ttt cgc atc 1363
Tyr Arg Thr Tyr Met Ser Ile Thr Met Leu Leu Met Tyr Phe Arg Ile
410 415 420

cgc aag cac cgc aag tat aaa aat tta aaa cac tat tct aaa atc 1408
Arg Lys His Arg Lys Tyr Lys Asn Leu Lys His Tyr Ser Lys Ile
425 430 435

taactttttc ttttaattta taaaaaacca aaggtttatg taagatttgc gcttttcaat 1468 

ccaacaagaa tcccttgtgc gcacattact tt 1500 
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Figure 12 (RY-39) 

Restriction enzyme analysis of CPN 100508 



BcefI CviJI ApoI
Hpy188IX Tsp509I | Hpy188IX | Tsp509I
| | | | | |

CTCTGATTTATGGTAATTCTTTATTTTCAGAGCCGTCAAGTCCTTTCTATTCTGTTGAAT 

1 + + + + + + 60

GAGACTAAATACCATTAAGAAATAAAAGTCTCGGCAGTTCAGGAAAGATAAGACAACTTA 

NdeI
MaeII AciI |
| | |

TTCCTAATAACGTAAGTAATAAACAATCAAAAGTCCGCATATGAAAAGACCTTTTTTTAC 

61 + + + + + + 120

AAGGATTATTGCATTCATTATTTGTTAGTTTTCAGGCGTATACTTTTCTGGAAAAAAATG 



CviRI |
MboII | |

DpnI
BstYI |
Sau3AI |
SfaNI | |
| | |

CviRI
AlwI |
||

HinfI
CviRI |
NlaIII |
NspI |
PleI |
| |



CTATCTATGCATCATCTTCTACGGATCTTGTGCATCGTTATCTTTACATGCAGGACTCTC 
121 + + + + + + 180
GATAGATACGTAGTAGAAGATGCCTAGAACACGTAGCAATAGAAATGTACGTCCTGAGAG 



BsaAI
BbvI |
MaeII |
RsaI | |
||

CjePI
Fnu4HI
CviJI |
AluI MwoI |
CviJI TseI|
| ||

HinfI
XcmI
NlaIII |
BcefI | |
PleI | | |BslI XmnI
|| | | |



TTTCCCAGAAGTACGTGGAGCTACGGCTGCTGTTGTCCATGCCGACTCTGGGAAGGTATT 

181 + + + + + + 240 

AAAGGGTCTTCATGCACCTCGATGCCGACGACAACAGGTACGGCTGAGACCCTTCCATAA 



SfaNI
CjePI |

NlaIII
Cac8I |
| |

CTATGATAAAGACATAGATGCTGTAATCTATCCTGCCAGCATGACGAAAATCGCAACTGC 

241 + + + + + + 300 

GATACTATTTCTGTATCTACGACATTAGATAGGACGGTCGTACTGCTTTTAGCGTTGACG 

MnlI
MnlI TaaI TaqI RleAI | SfaNI
| | | || |

CCTCTTTATCCTAAAGCACTATCCCACAGTCCTCGATACTCTCATCAAAGTCAAACAAGA 

301 + + + + + + 360 

GGAGAAATAGGATTTCGTGATAGGGTGTCAGGAGCTATGAGAGTAGTTTCAGTTTGTTCT 
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Fig. 12 (con't) 



Tth111II
BsiEI
PvuI
SgfI
DpnI |
Sau3AI | |
| |

Cac8I
AciI |
| |

Pfl1108I
EcoRV |
Hpy178III | |
Tth111II | | |
|| | |



TGCGATCGCTTCCATCACTCCGCAAGCAAAAAAACAATCAGGATATCGTAGTCCTCCCCA

361 + + + + + + 420

ACGCTAGCGAAGGTAGTGAGGCGTTCGTTTTTTTGTTAGTCCTATAGCATCAGGAGGGGT



Hpy178III
AceIII |
DpnI AluI TaqI | |
BsrI BstYI | CviJI EarI | | |
TspRI Sau3AI | MboII | SapI||| AluI
MnlI | BccI | | AlwI | | BccI ||| CviJI MboII
|| ||| | | | | | | | | | |

CTGGTTAGAAACTGATGGATCTACAATACAGCTCCATCTTCGAGAAGAGCTTTTAGGGTG 

421 + + + + + + 480 

GACCAATCTTTGACTACCTAGATGTTATGTCGAGGTAGAAGCTCTTCTCGAAAATCCCAC 



NlaIV
AvaII |
EcoO109I |
Psp5II |
Sau96I |
||

BbvI
SfaNI
BsrI |
| |

Fnu4HI
TseI |
MwoI | |
| |

AluI
CviJI
DdeI |
| |



GGACCTGTTCCACGCCTTACTGGTCTGTTCTGCTAATGATGCTGCGAATGTCTTAGCTAT 

481 + + + + + + 540



CCTGGACAAGGTGCGGAATGACCAGACAAGACGATTACTACGACGCTTACAGAATCGATA 



SfcI
DpnI |
BstYI
Sau3AI
AciI |
NlaIII | |
NspI | |
| |

AluI
CviJI
MboII
|

XmnI
|

BbvI
BsgI|
MseI | |



GGCATGTTGCGGATCTGTAGAGAAGTTTATGGATAAGCTGAACTTCTTCTTAAAAGAAGA 

541 + + + + + + 600



CCGTACAACGCCTAGACATCTCTTCAAATACCTATTCGACTTGAAGAAGAATTTTCTTCT 
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Fig. 12 (con't) 



CviRI
MboII
Fnu4HI |
CviJI | |
TseI | |
|||

MaeIII
NlaIII |
BsaJI
BstDSI
NcoI
StyI
FokI |
|

HinfI
TfiI
Hpy188IX |
| |



AATCGGCTGCACTCATACCCATTTTAATAATCCCCATGGGTTACATCATCCGAATCACTA 

601 + + + + + + 660

TTAGCCGACGTGAGTATGGGTAAAATTATTAGGGGTACCCAATGTAGTAGGCTTAGTGAT 



Sth132I
DpnI| Hpy188IX MnlI
Sau3AI | | SfaNI | BslI |
BscGI | || NlaIII | HhaI | MnlI TaqI | |
| | || | | | | | |||

TACTACAACCCGTGATCTTATTAGCATCATGCGTTGCGCTCTGAAAGAACCTCCATTTCG 

661 + + + + + + 720 

ATGATGTTGGGCACTAGAATAATCGTAGTACGCAACGCGAGACTTTCTTGGAGGTAAAGC 



DpnI
NlaIV
BamHI
BstYI
Sau3AI
AlwI
BspMI |
Hin4I AluI NlaIII | |
SimI CviJI CviJI CviRI | | |
| | | | |||

AGGGGTCATCTCCACGACAAGCTATAAAATAGGGGCTACAAACCTGCATGGCGAACGGAT 

721 + + + + + + 780 

TCCCCAGTAGAGGTGCTGTTCGATATTTTATCCCCGATGTTTGGACGTACCGCTTGCCTA 



MboII
Tsp509I

AccI
SimI |

ScrFI
BsaJI |
EcoRII | |
Tth111II | |
|||

BslI
AluI |
CviJI |
MnlI |
| |



CCTATCCCCAACAAACAAATTGCTTCTTCCTGGGTCTACCTACCACTATCCCCCAGCTTT 

781 + + + + + + 840



GGATAGGGGTTGTTTGTTTAACGAAGAAGGACCCAGATGGATGGTGATAGGGGGTCGAAA 



NlaIV
AvaII |
Sau96I |
||

PstI
CviRI |
BsmFI | |
SfcI|| |
||| |

Tsp509I
BbvI |
| |

Fnu4HI
CviJI |
TseI |



AGGAGGGAAAACAGGGACCACCAAGACTGCAGGGAAAAATCTAATTATGGCTGCTGAAAA 
841 + + + + + + 900
TCCTCCCTTTTGTCCCTGGTGGTTCTGACGTCCCTTTTTAGATTAATACCGACGACTTTT 
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Fig. 12 (con't) 



CviJI
BscGI
Sth132I |
BsiEI

PvuI
DpnI |
Sau3AI | |
Sth132I | |
MaeIII | | |
AciI BslI |MnlI | | |
| |||||

NlaIV
AvaII |
EcoO109I |
Psp5II |
Sau96I |
SimI |
||

DpnI
Sau3AI |
| |



AAATAACCGCCTCTTGGTAACGATCGCAACGGGCTATTCGGGTCCTGTGAGTGATCTCTA 

901 + + + + + + 960 

TTTATTGGCGGAGAACCATTGCTAGCGTTGCCCGATAAGCCCAGGACACTCACTAGAGAT 



BsrDI
|

TaaI MseI
|

AciI
Fnu4HI
TauI BsmFI
CviJI | MseI
|| |

BanII
BsiHKAI
Bsp1286I
SacI
AluI |
CviJI |
| |



CCAAGATGTCATTGCTCTATGTGAAACGGTATTTAACGAGCCGCTATTAAGAAAAGAGCT 
961 + + + + + + 1020
GGTTCTACAGTAACGAGATACACTTTGCCATAAATTGCTCGGCGATAATTCTTTTCTCGA 



BsmAI
Tsp509I
MnlI |
Hpy188IX TaaI | |
| |||

HinfI
TfiI
MmeI |
||

BseMII
MboII |
AluI | |
CviJI | |
HindIII | | |
||||



CGTCCCCCCCTCCGACTGTCTCCAATTAGAAATAGCGAATCTTGGGAAGCTTTCTTGCCC 

1021 + + + + + + 1080

GCAGGGGGGGAGGCTGACAGAGGTTAATCTTTATCGCTTAGAACCCTTCGAAAGAACGGG 



HinfI Hpy178III
MnlI | NruI
Bsu36I | | ThaI
DdeI | | MnlI |
EarI | | DpnI | |
Hpy178III | | Sau3AI | | |
MnlI PleI | | BsmFI Hpy188IX | | | | MboII MnlI
|||| | | | || | | |

TCTTCCTGAGGGACTCTACTATGACTTCTATGCCTCCGAAGATCGCGAACCTCTTTCTGT 

1081 + + + + + + 1140



AGAAGGACTCCCTGAGATGATACTGAAGATACGGAGGCTTCTAGCGCTTGGAGAAAGACA 
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Fig. 12 (con't) 



BsaHI
AciI |
NlaIII |
CviRI NspI |
| | |

DpnI
BglII |
BstYI
Sau3AI
MboII |
| |



ATCTTTTATTGCACATGCGGACGCCTTCCCTATTGAACAAGGAGATCTTCTTGGTCATTG 

1141 + + + + + + 1200 

TAGAAAATAACGTGTACGCCTGCGGAAGGGATAACTTGTTCCTCTAGAAGAACCAGTAAC 

ApoI
Tsp509I
MboII | CviJI
| | |
GGTTTTTTATGACGATGAAGGCAAGAAAATTTCTTCCCAGCCTTTCTATGCCCCTTGTCG 

1201 + + + + + + 1260

CCAAAAAATACTGCTACTTCCGTTCTTTTAAAGAAGGGTCGGAAAGATACGGGGAACAGC 



HhaI
|

BsaJI
StyI
CviJI |
||

AflIII
MaeII
BbsI |
MboII |
| |



TTTTGAGCGCACTATCAAGCCTTGGAAACTCTATATGAAACGTGTCTTCACATCGTATAG 

1261 + + + + + + 1320

AAAACTCGCGTGATAGTTCGGAACCTTTGAGATATACTTTGCACAGAAGTGTAGCATATC 



NlaIII
FokI
Fnu4HI
NlaIII |
TseI |
|

AciI
MwoI
SfaNI |
Cac8I| |
AciI | | |
| |||



AACCTATATGTCTATAACCATGCTGCTCATGTATTTTCGCATCCGCAAGCACCGCAAGTA 

1321 + + + + + + 1380

TTGGATATACAGATATTGGTACGACGAGTACATAAAAGCGTAGGCGTTCGTGGCGTTCAT 

ApoI
Tsp509I DraI Tsp509I
Tth111II |MseI| MseI|
| | || ||
TAAAAATTTAAAACACTATTCTAAAATCTAACTTTTTCTTTTAATTTATAAAAAACCAAA 

1381 + + + + + + 1440

ATTTTTAAATTTTGTGATAAGATTTTAGATTGAAAAAGAAAATTAAATATTTTTTGGTTT 



CjePI
CjePI HinfI HhaI |
HhaI | TfiI FspI|MmeI|
| | | || ||

GGTTTATGTAAGATTTGCGCTTTTCAATCCAACAAGAATCCCTTGTGCGCACATTACTTT 

1441 + + + + + + 1500 

CCAAATACATTCTAAACGCGAAAAGTTAGGTTGTTCTTAGGGAACACGCGTGTAATGAAA 
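
A restriction map of the kind shown in Figure 12 can be checked in outline by scanning the top strand for known recognition sequences. The following is a minimal sketch only, not the program used to prepare the figure: the helper names (`SITES`, `complement`, `find_sites`) are hypothetical, only three well-known recognition sequences are included, and the 60-nucleotide fragment is the top strand of positions 61 to 120 as printed in Figure 12.

```python
# Illustrative sketch: names below are assumptions, not part of the specification.
# Recognition sequences for three enzymes that appear in Figure 12.
SITES = {"NdeI": "CATATG", "EcoRV": "GATATC", "HindIII": "AAGCTT"}


def complement(seq):
    """Return the base-by-base complement (not reversed), as printed under each top strand."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))


def find_sites(seq, offset=1):
    """Map each enzyme name to the 1-based start positions of its recognition sequence.

    `offset` is the figure coordinate of the first nucleotide of `seq`.
    """
    hits = {}
    for name, site in SITES.items():
        positions = []
        start = seq.find(site)
        while start != -1:
            positions.append(start + offset)
            start = seq.find(site, start + 1)
        hits[name] = positions
    return hits


# Top strand of positions 61-120 of Figure 12.
top_61_120 = "TTCCTAATAACGTAAGTAATAAACAATCAAAAGTCCGCATATGAAAAGACCTTTTTTTAC"
print(find_sites(top_61_120, offset=61))
print(complement(top_61_120))
```

In this fragment the only hit among the three enzymes is the NdeI sequence, which agrees with the NdeI label printed above that row of the figure. The sketch reports where a recognition sequence begins, not the cleavage position, and it does not handle degenerate recognition sequences.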
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aaggagcaaa tggagattgg ccaaatagac gagcaagggt ttgcataaga atagcctttt 60 

tcgcaataat aacttgccta aacgatcttg taaacgactt atg gct tct aat ccc 115
Met Ala Ser Asn Pro
1 5

att tta cag ata gag gat cta tcc ata acc ttg gca aaa caa cgc caa 163
Ile Leu Gln Ile Glu Asp Leu Ser Ile Thr Leu Ala Lys Gln Arg Gln
10 15 20

cag tac ccc atc gtc caa tct tta tcg ttt act atc aat gaa gga caa 211
Gln Tyr Pro Ile Val Gln Ser Leu Ser Phe Thr Ile Asn Glu Gly Gln
25 30 35

acc tta gca atc att gga gaa tca gga tca gga aaa tct gtc tct gcg 259
Thr Leu Ala Ile Ile Gly Glu Ser Gly Ser Gly Lys Ser Val Ser Ala
40 45 50

cat gca atc ctt cga tta ctt cct tgc ccc cca ttt tct gtt tct ggc 307
His Ala Ile Leu Arg Leu Leu Pro Cys Pro Pro Phe Ser Val Ser Gly
55 60 65

cag gtc aac ttc caa ggc cac aac tta ctt acg gct tcg cgc tct ata 355
Gln Val Asn Phe Gln Gly His Asn Leu Leu Thr Ala Ser Arg Ser Ile
70 75 80 85

caa aaa aag att ata ggg aca gaa att tct atg atc ttt caa aac ccg 403
Gln Lys Lys Ile Ile Gly Thr Glu Ile Ser Met Ile Phe Gln Asn Pro
90 95 100

caa gca tct cta aac ccc gtg ttt act att gaa cag cag ttt cga gaa 451
Gln Ala Ser Leu Asn Pro Val Phe Thr Ile Glu Gln Gln Phe Arg Glu
105 110 115

att att cat acc cac cta gcc tta act gca gaa gtt gct aaa gaa aag 499
Ile Ile His Thr His Leu Ala Leu Thr Ala Glu Val Ala Lys Glu Lys
120 125 130

atg tta tac gct ctt gaa gaa aca ggg ttt cat gat ccc agg ctg tgc 547
Met Leu Tyr Ala Leu Glu Glu Thr Gly Phe His Asp Pro Arg Leu Cys
135 140 145

ttg aat ctc tac ccc cac caa ctc tct gga ggg atg ctt caa aga att 595
Leu Asn Leu Tyr Pro His Gln Leu Ser Gly Gly Met Leu Gln Arg Ile
150 155 160 165
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Fig. 13 (con't) 



tgc att gcc atg gcg ctc ctc tgt tct cct aaa ctt ctt att gct gat 643
Cys Ile Ala Met Ala Leu Leu Cys Ser Pro Lys Leu Leu Ile Ala Asp
170 175 180

gaa cct acg act gct tta gat gtt tct gtt cag tat cag att cta caa 691
Glu Pro Thr Thr Ala Leu Asp Val Ser Val Gln Tyr Gln Ile Leu Gln
185 190 195

tta cta aaa aca cta cag aaa aaa acg gga atg agc ctt ctt att att 739
Leu Leu Lys Thr Leu Gln Lys Lys Thr Gly Met Ser Leu Leu Ile Ile
200 205 210

acc cat aat atg gga gtc gtt gca gaa act gct gat gac gtg ctc gtg 787
Thr His Asn Met Gly Val Val Ala Glu Thr Ala Asp Asp Val Leu Val
215 220 225

ctc tat gca gga cgc atg gta gaa tgt gcc cct gcg gtt caa atg ttc 835
Leu Tyr Ala Gly Arg Met Val Glu Cys Ala Pro Ala Val Gln Met Phe
230 235 240 245

cat aat cct tct cat ccc tat acc cga gat ctt tta gca tcc aga ccc 883
His Asn Pro Ser His Pro Tyr Thr Arg Asp Leu Leu Ala Ser Arg Pro
250 255 260

tct cta caa ccg caa caa cta ggt tcc ttc aac ccc att cca gga cag 931
Ser Leu Gln Pro Gln Gln Leu Gly Ser Phe Asn Pro Ile Pro Gly Gln
265 270 275

ccc cca cac tac acg gcc ttt ccc tcg gga tgt cgc tat cac cct aga 979
Pro Pro His Tyr Thr Ala Phe Pro Ser Gly Cys Arg Tyr His Pro Arg
280 285 290

tgc tca aaa att tta aat cga tgt tct gcg gaa gct cca gaa atc tat 1027
Cys Ser Lys Ile Leu Asn Arg Cys Ser Ala Glu Ala Pro Glu Ile Tyr
295 300 305

ccg gta cgc gaa ggt cac aaa gta agg gtt ggc tgt atg acg act aat 1075
Pro Val Arg Glu Gly His Lys Val Arg Val Gly Cys Met Thr Thr Asn
310 315 320 325

ttt ccc caa cct tta att caa gca acc tca tta aca aag cac tat tac 1123
Phe Pro Gln Pro Leu Ile Gln Ala Thr Ser Leu Thr Lys His Tyr Tyr
330 335 340
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Fig. 13 (con't) 

aag cgt tcc ttt tgg ttt cag gga aag aca att gcc agt cgt cct gtt 1171
Lys Arg Ser Phe Trp Phe Gln Gly Lys Thr Ile Ala Ser Arg Pro Val
345 350 355

gac gac gtc tct ttt tca cta tac tcc aga cgt gct gtc gga ctt att 1219
Asp Asp Val Ser Phe Ser Leu Tyr Ser Arg Arg Ala Val Gly Leu Ile
360 365 370

gga gaa tct gga tca ggg aaa agt acc ctg gcg tta gct ctc gca ggt 1267
Gly Glu Ser Gly Ser Gly Lys Ser Thr Leu Ala Leu Ala Leu Ala Gly
375 380 385

ctc cta cct ctc acc tct ggg ttc tta act ttt aac ggc acc cca atc 1315
Leu Leu Pro Leu Thr Ser Gly Phe Leu Thr Phe Asn Gly Thr Pro Ile
390 395 400 405

aag ttg cat tct aaa cac gga cgc cat caa tta cga tct caa gta cgg 1363
Lys Leu His Ser Lys His Gly Arg His Gln Leu Arg Ser Gln Val Arg
410 415 420

ttg gtc ttt caa aat cca caa gct tca tta aac ccg cga aaa act atc 1411
Leu Val Phe Gln Asn Pro Gln Ala Ser Leu Asn Pro Arg Lys Thr Ile
425 430 435

cta gat agt tta ggc cac tct ctg ctt tac cat aaa ctc gtc cca aaa 1459
Leu Asp Ser Leu Gly His Ser Leu Leu Tyr His Lys Leu Val Pro Lys
440 445 450

gaa aaa gta cta gca acg gta agg gaa tat tta gaa ttg gta ggg tta 1507
Glu Lys Val Leu Ala Thr Val Arg Glu Tyr Leu Glu Leu Val Gly Leu
455 460 465

tct gag gag tat ttt tat cgt tat cct cac cag ctt tct gga gga caa 1555
Ser Glu Glu Tyr Phe Tyr Arg Tyr Pro His Gln Leu Ser Gly Gly Gln
470 475 480 485

caa caa cga gtc tct ata gcg aga gcc cta tta gga gtc cct cag tta 1603
Gln Gln Arg Val Ser Ile Ala Arg Ala Leu Leu Gly Val Pro Gln Leu
490 495 500

att att tgt gac gaa att gtt tct gct cta gat tta tct att caa gca 1651
Ile Ile Cys Asp Glu Ile Val Ser Ala Leu Asp Leu Ser Ile Gln Ala
505 510 515

caa att ctg aat atg ctt gcc gag ctg caa aaa aaa ctc agc ctc aca 1699
Gln Ile Leu Asn Met Leu Ala Glu Leu Gln Lys Lys Leu Ser Leu Thr
520 525 530
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Fig. 13 (con't) 



tat ctc ttc att tcg cat gat ctt gcc gtt gta cgc tcg ttc tgc aca 1747
Tyr Leu Phe Ile Ser His Asp Leu Ala Val Val Arg Ser Phe Cys Thr
535 540 545

gag gta ttc att atg tat aag ggg caa att gta gaa aaa gga aat aca 1795
Glu Val Phe Ile Met Tyr Lys Gly Gln Ile Val Glu Lys Gly Asn Thr
550 555 560 565

aaa cgc att ttt tct gat cca caa cat cct tat acg cgc atg ttg tta 1843
Lys Arg Ile Phe Ser Asp Pro Gln His Pro Tyr Thr Arg Met Leu Leu
570 575 580

aat gcc caa ctt cca gag act cct gat caa agg caa tct aaa cct ata 1891
Asn Ala Gln Leu Pro Glu Thr Pro Asp Gln Arg Gln Ser Lys Pro Ile
585 590 595

ttc caa gaa tat cac aaa gat tct gaa gaa tct tgc tct aca gga tgc 1939
Phe Gln Glu Tyr His Lys Asp Ser Glu Glu Ser Cys Ser Thr Gly Cys
600 605 610

tac ttt tac aat cgt tgt cca caa aaa caa gaa gct tgc aag tca gag 1987
Tyr Phe Tyr Asn Arg Cys Pro Gln Lys Gln Glu Ala Cys Lys Ser Glu
615 620 625

atc atc cca aat caa gga gac gcg cac cat aca tac cgt tgt atc cat 2035
Ile Ile Pro Asn Gln Gly Asp Ala His His Thr Tyr Arg Cys Ile His
630 635 640 645

tgattcgtcc tctacgctat tcttaagcta ccattaagga atcccaaggg agaggtctgc 2095

tctat 2100
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Figure 14 (RY-40) 

Restriction enzyme analysis of CPN 100515 

CviJI
HaeI
HaeIII
MscI
EaeI | CviRI CviJI MwoI
| | | | |

AAGGAGCAAATGGAGATTGGCCAAATAGACGAGCAAGGGTTTGCATAAGAATAGCCTTTT 

1 + + + + + + 60

TTCCTCGTTTACCTCTAACCGGTTTATCTGCTCGTTCCCAAACGTATTCTTATCGGAAAA 

DpnI
CjeI|
Sau3AI | | CviJI CjeI
||| | |

TCGCAATAATAACTTGCCTAAACGATCTTGTAAACGACTTATGGCTTCTAATCCCATTTT 

61 + + + + + + 120

AGCGTTATTATTGAACGGATTTGCTAGAACATTTGCTGAATACCGAAGATTAGGGTAAAA 



DpnI
Hin4I
BstYI | BccI
Sau3AI | BstXI CjePI |
HaeIV | | BsaJI | RsaI | |
MnlI Hin4I | | AlwI StyI | TaaI | | |
| | | | | || | | | |

ACAGATAGAGGATCTATCCATAACCTTGGCAAAACAACGCCAACAGTACCCCATCGTCCA 

121 + + + + + + 180 

TGTCTATCTCCTAGATAGGTATTGGAACCGTTTTGTTGCGGTTGTCATGGGGTAGCAGGT 

DpnI
Sau3AI
Hpy178III |
Bpu10I HinfI | |
DdeI TfiI | |
| | | ||

ATCTTTATCGTTTACTATCAATGAAGGACAAACCTTAGCAATCATTGGAGAATCAGGATC 

181 + + + + + + 240 

TAGAAATAGCAAATGATAGTTACTTCCTGTTTGGAATCGTTAGTAACCTCTTAGTCCTAG 



CjePI

AlwI
Hpy178III |
|

CviRI
NlaIII
NspI
SphI
Cac8I
HhaI |
FspI| |
BsmAI | | |
| || |



AGGAAAATCTGTCTCTGCGCATGCAATCCTTCGATTACTTCCTTGCCCCCCATTTTCTGT 

241 + + + + + + 300 

TCCTTTTAGACAGAGACGCGTACGTTAGGAAGCTAATGAAGGAACGGGGGGTAAAAGACA 
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Fig. 14 (con't) 



CviJI
HaeI
HaeIII
BsaJI

Cj.
HincII
ScrFI |
CviJI
EcoRII
HaeI
HaeIII
MscI
EaeI |
| |

HhaI
CviJI ThaI | BcefI
||| |



TTCTGGCCAGGTCAACTTCCAAGGCCACAACTTACTTACGGCTTCGCGCTCTATACAAAA 

301 + + + + + + 360

AAGACCGGTCCAGTTGAAGGTTCCGGTGTTGAATGAATGCCGAAGCGCGAGATATGTTTT 



DpnI
ApoI Sau3AI |
Tsp509I BsmFI | |
| |

FauI
Sth132I| BscGI
Cac8I | | Tth111II
AciI | | | SfaNI |
| | || ||

Sth132I

XmnI
Tsp509I |
Hpy178III | |
TaqI | | |
| | | |



AAAGATTATAGGGACAGAAATTTCTATGATCTTTCAAAACCCGCAAGCATCTCTAAACCC 

361 + + + + + + 420 

TTTCTAATATCCCTGTCTTTAAAGATACTAGAAAGTTTTGGGCGTTCGTAGAGATTTGGG 

CviRI
SfcI |
MwoI |
MseI | |
CviJI | | |
BfaI | | | |
| | | | | | | | ||

CGTGTTTACTATTGAACAGCAGTTTCGAGAAATTATTCATACCCACCTAGCCTTAACTGC 

421 + + + + + + 480 

GCACAAATGATAACTTGTCGTCAAAGCTCTTTAATAAGTATGGGTGGATCGGAATTGACG 

ScrFI
BsaJI
EcoRII
Tth111II

DpnI
NlaIII |
Sau3AI
Hpy178III
RcaI |
AlwI | |
MboII | |
| | |

BstAPI
MwoI
PstI |
| |

Hpy178III
|



AGAAGTTGCTAAAGAAAAGATGTTATACGCTCTTGAAGAAACAGGGTTTCATGATCCCAG

481 + + + + + + 540

TCTTCAACGATTTCTTTTCTACAATATGCGAGAACTTCTTTGTCCCAAAGTACTAGGGTC
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Fig. 14 (con't) 



CjePI
HinfI |
TfiI |
|

Hpy178III
BstXI |
SfaNI |
MnlI| |
XcmI | |
|| |

BsrDI
CviRI
BpmI
FokI
ApoI |
Tsp509I |
CjePI | |
||



GCTGTGCTTGAATCTCTACCCCCACCAACTCTCTGGAGGGATGCTTCAAAGAATTTGCAT 

541 + + + + + + 600

CGACACGAACTTAGAGATGGGGGTGGTTGAGAGACCTCCCTACGAAGTTTCTTAAACGTA 



HaeII
HhaI |

NlaIII
BsaJI
BstDSI
NcoI
StyI
BseRI |
MslI |
||



TGCCATGGCGCTCCTCTGTTCTCCTAAACTTCTTATTGCTGATGAACCTACGACTGCTTT

601 + + + + + + 660

ACGGTACCGCGAGGAGACAAGAGGATTTGAAGAATAACGACTACTTGGATGCTGACGAAA



HinfI
TfiI Sth132I
Hpy188IX |Tsp509I SfcI | BscGI
||| || |

AGATGTTTCTGTTCAGTATCAGATTCTACAATTACTAAAAACACTACAGAAAAAAACGGG 

661 + + + + + + 720

TCTACAAAGACAAGTCATAGTCTAAGATGTTAATGATTTTTGTGATGTCTTTTTTTGCCC 



HinfI
BslI |
PflMI |
| |

AlwNI
BstAPI
CviRI |
PleI MwoI MaeII
| | |



AATGAGCCTTCTTATTATTACCCATAATATGGGAGTCGTTGCAGAAACTGCTGATGACGT 
721 + + + + + + 780
TTACTCGGAAGAATAATAATGGGTATTATACCCTCAGCAACGTCTTTGACGACTACTGCA 



BsiHKAI
Bsp1286I
BsiHKAI |
Bsp1286I |
BssSI| | CviRI
|| ||

HgaI
NlaIII |
| |

Bsp1286I
BmgI |
BseSI |AciI CjePI
| | |



GCTCGTGCTCTATGCAGGACGCATGGTAGAATGTGCCCCTGCGGTTCAAATGTTCCATAA

781 + + + + + + 840

CGAGCACGAGATACGTCCTGCGTACCATCTTACACGGGGACGCCAAGTTTACAAGGTATT
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Fig. 14 (con't) 



CjePI
Sth132I |
DpnI
BglII

BstYI
Sau3AI
CjePI |
FokI | |
AvaI | | |
||| |

SfaNI
SimI |
Hpy178III| |
| || |

AciI
MnlI |
||



TCCTTCTCATCCCTATACCCGAGATCTTTTAGCATCCAGACCCTCTCTACAACCGCAACA 

841 + + + + + + 900

AGGAAGAGTAGGGATATGGGCTCTAGAAAATCGTAGGTCTGGGAGAGATGTTGGCGTTGT 

BcefI
Hpy178III
AvaI |
BsaJI
RleAI
ScrFI Sth132I |
BfaI BslI| CviJI| |
CjePI|NlaIV EcoRII|| CviJI HaeIII | |
|| | ||| | || |

ACTAGGTTCCTTCAACCCCATTCCAGGACAGCCCCCACACTACACGGCCTTTCCCTCGGG 



901 + + + + + + 960



TGATCCAAGGAAGTTGGGGTAAGGTCCTGTCGGGGGTGTGATGTGCCGGAAAGGGAGCCC 



SfaNI
MnlI | BfaI
HphI | |FokI |
| | | | |

BpmI
ClaI |
ApoI DraI | |
Tsp509I MseI|TaqI |
| || | |

Hpy178III
AluI |
AciI CviJI |
| | |



ATGTCGCTATCACCCTAGATGCTCAAAAATTTTAAATCGATGTTCTGCGGAAGCTCCAGA 
961 + + + + + + 1020
TACAGCGATAGTGGGATCTACGAGTTTTTAAAATTTAGCTACAAGACGCCTTCGAGGTCT 



MaeIII
Tsp45I
MspI ThaI |
BsaWI|RsaI | | CviJI Tsp509I
|| | | | | |

AATCTATCCGGTACGCGAAGGTCACAAAGTAAGGGTTGGCTGTATGACGACTAATTTTCC 

1021 + + + + + + 1080 

TTAGATAGGCCATGCGCTTCCAGTGTTTCATTCCCAACCGACATACTGCTGATTAAAAGG 

MnlI
Tsp509I Tth111II|
MseI | MseI | |
|| |||

CCAACCTTTAATTCAAGCAACCTCATTAACAAAGCACTATTACAAGCGTTCCTTTTGGTT 

1081 + + + + + + 1140

GGTTGGAAATTAAGTTCGTTGGAGTAATTGTTTCGTGATAATGTTCGCAAGGAAAACCAA 
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Fig. 14 (con't) 



MunI
Tsp509I

BsmAI
BsmBI
BpmI
AatII |
BsaHI
Hin4I
MaeII
Tth111I |
HincII | |
||

Hpy178III
MmeI |
| |



TCAGGGAAAGACAATTGCCAGTCGTCCTGTTGACGACGTCTCTTTTTCACTATACTCCAG 

1141 + + + + + + 1200



AGTCCCTTTCTGTTAACGGTCAGCAGGACAACTGCTGCAGAGAAAAAGTGATATGAGGTC 



Hpy188IX
AhdI |
HaeIV| |
Hin4I| |
MaeII | | |
| ||

DpnI
Sau3AI |
Hpy178III| |
HinfI | | |
TfiI | | |
| ||

ScrFI
BsaJI | AluI
EcoRII | CviJI
RsaI | | BspMI |
||| ||



ACGTGCTGTCGGACTTATTGGAGAATCTGGATCAGGGAAAAGTACCCTGGCGTTAGCTCT

1201 + + + + + + 1260

TGCACGACAGCCTGAATAACCTCTTAGACCTAGTCCCTTTTCATGGGACCGCAATCGAGA



BsaI
BsmAI MseI NlaIV
HphI| MnlI MnlI| MseI BanI | BcefI
|| | || ||| |

CGCAGGTCTCCTACCTCTCACCTCTGGGTTCTTAACTTTTAACGGCACCCCAATCAAGTT 

1261 + + + + + + 1320

GCGTCCAGAGGATGGAGAGTGGAGACCCAAGAATTGAAAATTGCCGTGGGGTTAGTTCAA 



SmlI
DpnI
Sau3AI
HgaI |
Tsp509I |

BsmI
CviRI
|

BccI
Bce83I |
BsaHI | |
| |

TaaI
RsaI |
| |



GCATTCTAAACACGGACGCCATCAATTACGATCTCAAGTACGGTTGGTCTTTCAAAATCC 

1321 + + + + + + 1380

CGTAAGATTTGTGCCTGCGGTAGTTAATGCTAGAGTTCATGCCAACCAGAAAGTTTTAGG 



FauI
AluI Sth132I|
CviJI ThaI | |
HindIII | MseI AciI | | |
|| | | | ||

CviJI
HaeI
HaeIII
|



ACAAGCTTCATTAAACCCGCGAAAAACTATCCTAGATAGTTTAGGCCACTCTCTGCTTTA 

1381 + + + + + + 1440



TGTTCGAAGTAATTTGGGCGCTTTTTGATAGGATCTATCAAATCCGGTGAGAGACGAAAT 
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Fig. 14 (con't) 



BfaI
RsaI |
ScaI |
TatI | |
| | |

Tsp509I
SspI | BseMII
| |



CCATAAACTCGTCCCAAAAGAAAAAGTACTAGCAACGGTAAGGGAATATTTAGAATTGGT 

1441 + + + + + + 1500 

GGTATTTGAGCAGGGTTTTCTTTTTCATGATCGTTGCCATTCCCTTATAAATCTTAACCA 



DdeI
Hpy188IX
MnlI |
| |

BseRI
HphI

Hpy178III
BstXI
MnlI
MnlI |
AluI | |
CviJI | |
|||



AGGGTTATCTGAGGAGTATTTTTATCGTTATCCTCACCAGCTTTCTGGAGGACAACAACA 

1501 + + + + + + 1560

TCCCAATAGACTCCTCATAAAAATAGCAATAGGAGTGGTCGAAAGACCTCCTGTTGTTGT 



PleI
BsmAI
SfcI|
BpmI | |
HinfI | | |
| | ||

BanII
Bsp1286I
CviJI |
BsmFI | |
| | |

BseMII
MaeIII
Tsp45I
MnlI

Tsp509I
MseI |
PleI | |
DdeI | | |
| | ||



ACGAGTCTCTATAGCGAGAGCCCTATTAGGAGTCCCTCAGTTAATTATTTGTGACGAAAT 

1561 + + + + + + 1620



TGCTCAGAGATATCGCTCTCGGGATAATCCTCAGGGAGTCAATTAATAAACACTGCTTTA 



BbvI
Tth111II
AceIII |
Hpy188IX
Hpy178III Tth111II |
BfaI | ApoI | |
XbaI| | Tsp509I | |
||| | | |

TGTTTCTGCTCTAGATTTATCTATTCAAGCACAAATTCTGAATATGCTTGCCGAGCTGCA 

1621 + + + + + + 1680

ACAAAGACGAGATCTAAATAGATAAGTTCGTGTTTAAGACTTATACGAACGGCTCGACGT 



CviRI
Fnu4HI
AluI|
CviJI |
TseI |
MwoI | |
Cac8I | | |
| | |

MboII DpnI
CviJI | MnlI BcefI NlaIII|
DdeI | | BseMII | EarI | Sau3AI | | BsgI RsaI MwoI
| | | || || ||| || |

AAAAAAACTCAGCCTCACATATCTCTTCATTTCGCATGATCTTGCCGTTGTACGCTCGTT 

1681 + + + + + + 1740

TTTTTTTGAGTCGGAGTGTATAGAGAAGTAAAGCGTACTAGAACGGCAACATGCGAGCAA 
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Fig. 14 (con't) 

CviRI
MnlI | Tsp509I
| | |

CTGCACAGAGGTATTCATTATGTATAAGGGGCAAATTGTAGAAAAAGGAAATACAAAACG 

1741 + + + + + + 1800 

GACGTGTCTCCATAAGTAATACATATTCCCCGTTTAACATCTTTTTCCTTTATGTTTTGC 



DpnI
Sau3AI |
Hpy188IX| | NlaIII Hpy178III
FokI | | | HhaI |MseI PleI |
AlwI| | | | ThaI |NspI | BsmAI | |
|| || | | | | | |||

CATTTTTTCTGATCCACAACATCCTTATACGCGCATGTTGTTAAATGCCCAACTTCCAGA 

1801 + + + + + + 1860

GTAAAAAAGACTAGGTGTTGTAGGAATATGCGCGTACAACAATTTACGGGTTGAAGGTCT 



DpnI
BclI |
Sau3AI |
Hpy178III| |
HaeIV| | | Hpy188IX
Hin4I|| | HinfI | HinfI
HinfI | | | | TfiI | TfiI
| ||| | |||
GACTCCTGATCAAAGGCAATCTAAACCTATATTCCAAGAATATCACAAAGATTCTGAAGA 

1861 + + + + + + 1920 

CTGAGGACTAGTTTCCGTTAGATTTGGATATAAGGTTCTTATAGTGTTTCTAAGACTTCT 

CviRI
FokI |
Cac8I|
AluI | |
CviJI | |
FokI HindIII | | |
| | | |||

ATCTTGCTCTACAGGATGCTACTTTTACAATCGTTGTCCACAAAAACAAGAAGCTTGCAA 

1921 + + + + + + 1980

TAGAACGAGATGTCCTACGATGAAAATGTTAGCAACAGGTGTTTTTGTTCTTCGAACGTT 



SfcI
MboII |
SfaNI | |Eco57I
|| |

DpnI BslI BaeI BciVI
Sau3AI | BsmAI | HhaI HinfI |
Hpy188IX | | BsmBI | ThaI |HgaI TaaI TfiI |
| | | | | ||| | ||

GTCAGAGATCATCCCAAATCAAGGAGACGCGCACCATACATACCGTTGTATCCATTGATT 

1981 + + + + + + 2040 

CAGTCTCTAGTAGGGTTTAGTTCCTCTGCGCGTGGTATGTATGGCAACATAGGTAACTAA 
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Fig. 14 (con't) 



BaeI
Hin4I |
| |

AluI
CviJI
MseI
AflII |
SmlI |
MnlI | |
| ||

MnlI
BsaJI |
HinfI | |
MseI TfiI StyI|
| | ||

Hin4I
|



CGTCCTCTACGCTATTCTTAAGCTACCATTAAGGAATCCCAAGGGAGAGGTCTGCTCTAT 
2041 + + + + + + 2100
GCAGGAGATGCGATAAGAATTCGATGGTAATTCCTTAGGGTTCCCTCTCCAGACGAGATA 
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Figure 15: 

cgaagagcaa acctccacag ttacagagaa agacgtccaa cctaaaacac aagcaacacc 60

acacgcttcg aagaaaaacg ttgcaagtcc ttcgacctct atg cca gga atc gag 115
Met Pro Gly Ile Glu
1 5

aaa gca gca aca aca gtg gct gta cct caa gac aaa tct gaa gaa gaa 163
Lys Ala Ala Thr Thr Val Ala Val Pro Gln Asp Lys Ser Glu Glu Glu
10 15 20

aaa gtt aaa gag cga ttg aca aag cgg gaa ctt acc tgt gaa gac ctt 211
Lys Val Lys Glu Arg Leu Thr Lys Arg Glu Leu Thr Cys Glu Asp Leu
25 30 35

aaa gat aac ggc tat act gtc aat ttt gaa gac att tct att tta gag 259
Lys Asp Asn Gly Tyr Thr Val Asn Phe Glu Asp Ile Ser Ile Leu Glu
40 45 50

ttg ttg cag ttc gta agt aaa att tct gga acg aac ttt gtc ttt gat 307
Leu Leu Gln Phe Val Ser Lys Ile Ser Gly Thr Asn Phe Val Phe Asp
55 60 65

agc aac gat ttg caa ttc aat gtc acg atc gtt tcc cac gat cct act 355
Ser Asn Asp Leu Gln Phe Asn Val Thr Ile Val Ser His Asp Pro Thr
70 75 80 85

tct gta gat gat tta tct aca atc tta cta caa gtc tta aaa atg cat 403
Ser Val Asp Asp Leu Ser Thr Ile Leu Leu Gln Val Leu Lys Met His
90 95 100

gac ttg aag gtt gtt gaa caa ggc aat aac gtc ctt atc tat cgt aat 451
Asp Leu Lys Val Val Glu Gln Gly Asn Asn Val Leu Ile Tyr Arg Asn
105 110 115

cct cat ctt tct aag cta tcc aca gta gtc aca gac agc tcc tta aaa 499
Pro His Leu Ser Lys Leu Ser Thr Val Val Thr Asp Ser Ser Leu Lys
120 125 130

gaa acg tgt gaa gct gtt gtg gtt acc cga gtg ttc cgt ctt tac agg 547
Glu Thr Cys Glu Ala Val Val Val Thr Arg Val Phe Arg Leu Tyr Arg
135 140 145

cgt cag ccc tct gca gca gta aat att att caa cct tta ctt tcc cat 595
Arg Gln Pro Ser Ala Ala Val Asn Ile Ile Gln Pro Leu Leu Ser His
150 155 160 165

gat gct atc gtt agt gct tca gaa gct act cgt cat gtt atc atc tcg 643
Asp Ala Ile Val Ser Ala Ser Glu Ala Thr Arg His Val Ile Ile Ser
170 175 180

gat att gct ggt aat gtc gat aaa gtc agt gat ttg cta gca gct cta 691
Asp Ile Ala Gly Asn Val Asp Lys Val Ser Asp Leu Leu Ala Ala Leu
185 190 195
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Fig. 15 (con't)

gat tgc cca ggc aca tct gtg gac atg act gaa tac gaa gtt aaa tat 

Asp Cys Pro Gly Thr Ser Val Asp Met Thr Glu Tyr Glu Val Lys Tyr 

200 205 210 

gcc aat ccc gca gct ctt gtt agc tac tgc caa gat gtt ctt ggt act 
Ala Asn Pro Ala Ala Leu Val Ser Tyr Cys Gln Asp Val Leu Gly Thr 
215 220 225 

ctg gcc gaa gat gat gct ttc caa atg ttc atc caa cct gga acg aac 
Leu Ala Glu Asp Asp Ala Phe Gln Met Phe Ile Gln Pro Gly Thr Asn 
230 235 240 245 

aaa att ttc gtc gtc tct tca cca cgt ctt gca aat aag gca gag cag 
Lys Ile Phe Val Val Ser Ser Pro Arg Leu Ala Asn Lys Ala Glu Gln 
250 255 260 

ctc ctg aag tcc tta gat gtc cca gaa atg gca cat acc cta gat gat 
Leu Leu Lys Ser Leu Asp Val Pro Glu Met Ala His Thr Leu Asp Asp 
265 270 275 

cct gca agt act gcc ttg gct ttg gga gga aca gga acc acg agc cct 
Pro Ala Ser Thr Ala Leu Ala Leu Gly Gly Thr Gly Thr Thr Ser Pro 
280 285 290 

aag agt ttg cgg ttc ttt atg tac aag ctg aag tat caa aat gga gaa 
Lys Ser Leu Arg Phe Phe Met Tyr Lys Leu Lys Tyr Gln Asn Gly Glu 
295 300 305 

gtg att gct aat gcc ctc caa gat atc ggt tac aat cta tat gta acc 
Val Ile Ala Asn Ala Leu Gln Asp Ile Gly Tyr Asn Leu Tyr Val Thr 
310 315 320 325 

aca gct atg gac gaa gat ttc att aac act ctc aat agt atc cag tgg 
Thr Ala Met Asp Glu Asp Phe Ile Asn Thr Leu Asn Ser Ile Gln Trp 
330 335 340 

tta gag gtc aat aac tcc ata gtt att atc gga aac caa ggg aat gtc 
Leu Glu Val Asn Asn Ser Ile Val Ile Ile Gly Asn Gln Gly Asn Val 
345 350 355 

gac aga gtt att ggc ctc tta aac ggt tta gat tta cct cct aaa cag 
Asp Arg Val Ile Gly Leu Leu Asn Gly Leu Asp Leu Pro Pro Lys Gln 
360 365 370 

gtt tac atc gaa gtt tta att cta gat acc agc tta gag aaa tcc tgg 
Val Tyr Ile Glu Val Leu Ile Leu Asp Thr Ser Leu Glu Lys Ser Trp 
375 380 385 

gac ttt gga gtg caa tgg gta gcc cta ggt gat gaa caa agt aaa gta 
Asp Phe Gly Val Gln Trp Val Ala Leu Gly Asp Glu Gln Ser Lys Val 
390 395 400 405 
(Fig. 15 con't)

gct tat gct tct gga cta ttg aat aat act ggc ata gcc aca cct aca 1363
Ala Tyr Ala Ser Gly Leu Leu Asn Asn Thr Gly Ile Ala Thr Pro Thr
410 415 420

aaa gca act gtc cct ccc ggc acg cca aat cct ggt tcg atc cct ctt 1411
Lys Ala Thr Val Pro Pro Gly Thr Pro Asn Pro Gly Ser Ile Pro Leu
425 430 435

cct acg cca gga caa ttg aca ggg ttc tca gat atg ctg aac tct tcg 1459
Pro Thr Pro Gly Gln Leu Thr Gly Phe Ser Asp Met Leu Asn Ser Ser
440 445 450

tca gca ttc ggt cta gga atc atc gga aat gtc cta agt cat aaa ggg 1507
Ser Ala Phe Gly Leu Gly Ile Ile Gly Asn Val Leu Ser His Lys Gly
455 460 465

aag tct ttc ctt act ttg gga ggc tta tta agt gcc tta gat caa gat 1555
Lys Ser Phe Leu Thr Leu Gly Gly Leu Leu Ser Ala Leu Asp Gln Asp
470 475 480 485

gga gat act gtc att gtc ttg aat cct aga atc atg gct cag gat acg 1603
Gly Asp Thr Val Ile Val Leu Asn Pro Arg Ile Met Ala Gln Asp Thr
490 495 500

caa caa gct tcg ttt ttt gta ggg caa acg gtc cct tac caa act atc 1651
Gln Gln Ala Ser Phe Phe Val Gly Gln Thr Val Pro Tyr Gln Thr Ile
505 510 515

aaa tac tat atc caa gaa aca gga act gta acg caa aat atc gat tat 1699
Lys Tyr Tyr Ile Gln Glu Thr Gly Thr Val Thr Gln Asn Ile Asp Tyr
520 525 530

gaa gat att gga gtg aac ctt gtc gtt acc tct aca gtt gct ccc aac 1747
Glu Asp Ile Gly Val Asn Leu Val Val Thr Ser Thr Val Ala Pro Asn
535 540 545

aat gta gtt aca cta caa atc gaa cag acg atc tca gaa tta cat tcc 1795
Asn Val Val Thr Leu Gln Ile Glu Gln Thr Ile Ser Glu Leu His Ser
550 555 560 565

gcg tct gga tca cta aca cct gtc aca gat aaa act tat gca gcc aca 1843
Ala Ser Gly Ser Leu Thr Pro Val Thr Asp Lys Thr Tyr Ala Ala Thr
570 575 580

cgc tta caa att ccc gac ggt tgt ttc tta gtt atg agt ggg cat atc 1891
Arg Leu Gln Ile Pro Asp Gly Cys Phe Leu Val Met Ser Gly His Ile
585 590 595

aga gat aaa act aca aaa gtg gtt tca gga gtg cct ttg cta aac tcc 1939
Arg Asp Lys Thr Thr Lys Val Val Ser Gly Val Pro Leu Leu Asn Ser
600 605 610
(Fig. 15 con't)

ata cca tta att cgt ggt tta ttt agc cgt acc atc gac caa agg caa 1987
Ile Pro Leu Ile Arg Gly Leu Phe Ser Arg Thr Ile Asp Gln Arg Gln
615 620 625

aaa cgc aat atc atg atg ttt att aag cct aag gtg att agt agc ttt 2035
Lys Arg Asn Ile Met Met Phe Ile Lys Pro Lys Val Ile Ser Ser Phe
630 635 640 645

gaa gaa ggc act cgt gtt acc aat aag gaa gga tac aga tac aat tgg 2083
Glu Glu Gly Thr Arg Val Thr Asn Lys Glu Gly Tyr Arg Tyr Asn Trp
650 655 660

gaa gct gat gaa gga tcc atg caa gtg gcc cct cgc cat gct cct gaa 2131
Glu Ala Asp Glu Gly Ser Met Gln Val Ala Pro Arg His Ala Pro Glu
665 670 675

tgc caa gga cct cct tct tta cag gct gaa agt gac ttt aaa ata ata 2179
Cys Gln Gly Pro Pro Ser Leu Gln Ala Glu Ser Asp Phe Lys Ile Ile
680 685 690

gaa ata gaa gct cag tagtggtata taaaagagga agatgatatt ctccgccgtg 2234
Glu Ile Glu Ala Gln
695

gaatagcttc tgactctgtt gcattcaggg ggaaagccaa gaagatgtag agtcggccgt 2294
ataact 2300 
Figure 16 (RY-41)

Restriction enzyme analysis of CPN100538

Mnll Aatll 
Maelll | BsaHI | 

MboII | | Maell j 

Hin4I | Taal | BplI | | Bsbl 

I I I I I I I I 

CGAAGAGCAAACCTCCACAGTTACAGAGAAAGACGTCCAACCTAAAACACAAGCAACACC 

1 + + + + + + 60

GCTTCTCGTTTGGAGGTGTCAATGTCTCTTTCTGCAGGTTGGATTTTGTGTTCGTTGTGG 



NspV 
TaqI 
Tthlllll | 
Mmel | |BcgI Mae I I 
I I I I 



CviRI 
Bcgl| 
MboII j 
Acll | | 



II 



TaqI 
I 



Fnu4HI 
Tsel | 
Bcgl 
Hpyl78III 
TaqI 



Hinfl 
Tfil 
Bcgl 
Mnll 
ScrFI | 
EcoRII | j 
I II 



ACACGCTTCGAAGAAAAACGTTGCAAGTCCTTCGACCTCTATGCCAGGAATCGAGAAAGC 
TGTGCGAAGCTTCTTTTTGCAACGTTCAGGAAGCTGGAGATACGGTCCTTAGCTCTTTCG 



Hpyl78III 
Smll 



Rsal 
TspRI 
AlwNI 
CviJI 
Bbvl | 
Taal | | 
Bce83I | | | 
I III 



Hpyl8 8IX 
Mnll | 
I I 



Faul 
Sthl32I | 
EC057I | | 

MboII I I I 
Msel | j || 
MboII || I || 
II I I 



AGCAACAACAGTGGCTGTACCTCAAGACAAATCTGAAGAAGAAAAAGTTAAAGAGCGATT 
TCGTTGTTGTCACCGACATGGAGTTCTGTTTAGACTTCTTCTTTTTCAATTTCTCGCTAA 



Acil 
I 



MboII 
Bbsl | 
Msel | CviJI 
I I I 



Beef I 
Tsp509I | 
Taal | j 

I I I 



GACAAAGCGGGAACTTACCTGTGAAGACCTTAAAGATAACGGCTATACTGTCAATTTTGA 

181 + + + + + + 240

CTGTTTCGCCCTTGAATGGACACTTCTGGAATTTCTATTGCCGATATGACAGTTAAAACT 
Fig. 16 (con't)



Hpyl78III 

MboII Apol | 

Bbsl | CviRI Tsp509I j 

I I I II 

AGACATTTCTATTTTAGAGTTGTTGCAGTTCGTAAGTAAAATTTCTGGAACGAACTTTGT 

241 + + + + + + 300 

TCTGTAAAGATAAAATCTCAACAACGTCAAGCATTCATTTTAAAGACCTTGCTTGAAACA 



BsiEI 
Pvul 
Dpnl | 
Sau3AI 
Hpyl78III | 
Tsp509I Maelll | | 
CviRI | Tsp4 5I j I 
II III 

CTTTGATAGCAACGATTTGCAATTCAATGTCACGATCGTTTCCCACGATCCTACTTCTGT 

301 + + + + + + 360 

GAAACTATCGTTGCTAAACGTTAAGTTACAGTGCTAGCAAAGGGTGCTAGGATGAAGACA 



Dpnl 
Sau3AI | 
Alwl | j Sfcl 

III 



Nlalll 

HaelV Nsil | 

Hin4I Msel CviRI | j 

I I I I I 

AGATGATTTATCTACAATCTTACTACAAGTCTTAAAAATGCATGACTTGAAGGTTGTTGA 

361 + + + + + + 420 

TCTACTAAATAGATGTTAGAATGATGTTCAGAATTTTTACGTACTGAACTTCCAACAACT 



Alul 

CviJI Maelll 
Ddel | Tsp45I 
Mae I I Mnll | Taal | 

I I I I I 

ACAAGGCAATAACGTCCTTATCTATCGTAATCCTCATCTTTCTAAGCTATCCACAGTAGT 

421 + + + + + + 480 

TGTTCCGTTATTGCAGGAATAGATAGCATTAGGAGTAGAAAGATTCGATAGGTGTCATCA 



Af 1III 
Mae 1 1 

Alul Acelll | Alul BstEII Sthl32I 

CviJI Msel | j CviJI Maelll Aval |Hgal 

I I I I I II II 

CACAGACAGCTCCTTAAAAGAAACGTGTGAAGCTGTTGTGGTTACCCGAGTGTTCCGTCT 

481 + + + + + + 540 

GTGTCTGTCGAGGAATTTTCTTTGCACACTTCGACAACACCAATGGGCTCACAAGGCAGA 
Fig. 16 (con't)



Mnll 
PstI 
Fnu4HI 
CviRI | 

Tsel j | | Bbvl Eco57I 
BsaHI CviJI Sfcl Ml | Sspl| SfaNI Nlalll | 

I I I III I II I M 

TTACAGGCGTCAGCCCTCTGCAGCAGTAAATATTATTCAACCTTTACTTTCCCATGATGC 

541 + + + + + + 600 

AATGTCCGCAGTCGGGAGACGTCGTCATTTATAATAAGTTGGAAATGAAAGGGTACTACG 



Alul 
CviJI 
Mwol | 

Hpyl88IX | | Nlalll Hpyl88IX TaqI 

III I I I 

TATCGTTAGTGCTTCAGAAGCTACTCGTCATGTTATCATCTCGGATATTGCTGGTAATGT 

601 + + + + + + 660

ATAGCAATCACGAAGTCTTCGATGAGCAGTACAATAGTAGAGCCTATAACGACCATTACA 



Hpyl78III 
Bfal ' 
Xbal 
Alul 
CviJI 
Fnu4HI 
Tsel I 



Cac8I 
Bfal | 
Nhel | j 
TspRI | | | 
III 



ScrFI 
BsaJI | 
EcoRII j 
AceIIl| | 
Bbvl | j j 
III ' 



Nlalll 
I 



CGATAAAGTCAGTGATTTGCTAGCAGCTCTAGATTGCCCAGGCACATCTGTGGACATGAC 

661 + + + + + + 720 

GCTATTTCAGTCACTAAACGATCGTCGAGATCTAACGGGTCCGTGTAGACACCTGTACTG 



Faul 
Sthl32I | 
Alul | j 
CviJI j j 
Fnu4HI | j j Acelll 
Tsel | III Alul 
Acil || III CviJI 
Msel Mwol | || jjj Bbvl | Xcml 

I II II III II I 

TGAATACGAAGTTAAATATGCCAATCCCGCAGCTCTTGTTAGCTACTGCCAAGATGTTCT 

721 + + + + + + 780 

ACTTATGCTTCAATTTATACGGTTAGGGCGTCGAGAACAATCGATGACGGTTCTACAAGA 
Fig. 16 (con't)

CviJI 
Haelll 
Sf aNI | 

Rsal Eael|| MboII ScrFI Apol 

Bcgl |GdiIl|| Fokl | Bcgl EcoRII | Tsp509I 

I I III II III I 

TGGTACTCTGGCCGAAGATGATGCTTTCCAAATGTTCATCCAACCTGGAACGAACAAAAT 

781 + + + + + + 840 

ACCATGAGACCGGCTTCTACTACGAAAGGTTTACAAGTAGGTTGGACCTTGCTTGTTTTA 



HphI 
Mmel | 
MboII | | 
II I 



Mae 1 1 
Earl | 
BsmAI | j 
BsmBI I 



CviRI Mwol 



BsmFI 
Hpyl78III 
AlwNI ' 
Alul 
CviJI 
Fnu4HI | 
Tsel| | 
II ' 



Ddel 
Acelll | 
Bbvl | j 

II I 



TTTCGTCGTCTCTTCACCACGTCTTGCAAATAAGGCAGAGCAGCTCCTGAAGTCCTTAGA 



AAAGCAGCAGAGAAGTGGTGCAGAACGTTTATTCCGTCTCGTCGAGGACTTCAGGAATCT 



Dpnl 
Sau3AI | 
Alwl 
Bfal 



BsaJI 
BstAPI 
Mwol 
Rsal | 
Seal | 
Tat I | 
CviRI 



BslI Alwl | | TatI | | | CviJI 

Eco57I | Bfal | | CviRI | jjstyl Mnll 

II I I I I I II I I 

TGTCCCAGAAATGGCACATACCCTAGATGATCCTGCAAGTACTGCCTTGGCTTTGGGAGG 

901 + + + + + + 960 

ACAGGGTCTTTACCGTGTATGGGATCTACTAGGACGTTCATGACGGAACCGAAACCCTCC 



Ddel 
Banll 
Bspl286I 
BssSI | 
Drdll | | 
Nlaiv| | CviJI | 
III II 



Rsal 
BsrGI | Alul 
TatI | CviJI 
I I I 



AACAGGAACCACGAGCCCTAAGAGTTTGCGGTTCTTTATGTACAAGCTGAAGTATCAAAA 

961 + + + + + + 1020 

TTGTCCTTGGTGCTCGGGATTCTCAAACGCCAAGAAATACATGTTCGACTTCATAGTTTT 
Fig. 16 (con't)

Maelll Alul 

Mnll | CviJI 

Eco57I EcoRV| j Maelll MslI 

I II I I I 

TGGAGAAGTGATTGCTAATGCCCTCCAAGATATCGGTTACAATCTATATGTAACCACAGC 

1021 + + + + + + 1080

ACCTCTTCACTAACGATTACGGGAGGTTCTATAGCCAATGTTAGATATACATTGGTGTCG 



BciVI 
TspRI 

MboII Mnll | 

BstXI Msel | BsrI I j 

I II III 

TATGGACGAAGATTTCATTAACACTCTCAATAGTATCCAGTGGTTAGAGGTCAATAACTC 

1081 + + + + + + 1140

ATACCTGCTTCTAAAGTAATTGTGAGAGTTATCATAGGTCACCAATCTCCAGTTATTGAG 



Hindi 

AccI | CviJI 

BsaJI Taqlj Hael Mnll 

Hpyl88IX Styl Sail | j Haelll Msel Taal 

II III III 

CATAGTTATTATCGGAAACCAAGGGAATGTCGACAGAGTTATTGGCCTCTTAAACGGTTT 

1141 + + + + + + 1200 

GTATCAATAATAGCCTTTGGTTCCCTTACAGCTGTCTCAATAACCGGAGAATTTGCCAAA 



Ddel 

Hpyl78III Alul | 

Tsp509I Bfal| CviJI j 

Mnll TaqI Msel | Xbal | | Cj ePI || 

I I II III I II 

AGATTTACCTCCTAAACAGGTTTACATCGAAGTTTTAATTCTAGATACCAGCTTAGAGAA 

1201 + + + + + + 1260

TCTAAATGGAGGATTTGTCCAAATGTAGCTTCAAAATTAAGATCTATGGTCGAATCTCTT 



Bfal 
Avrll | 
BsaJI j 

ScrFI BsmFI Styl j 

BsaJI | CjePI CviJI || Alul 

EcoRIl|| CviRl| BsrDI | || HphI CviJI 

III II I I II I I 

ATCCTGGGACTTTGGAGTGCAATGGGTAGCCCTAGGTGATGAACAAAGTAAAGTAGCTTA 

1261 + + + + + + 1320 

TAGGACCCTGAAACCTCACGTTACCCATCGGGATCCACTACTTGTTTCATTTCATCGAAT 
Fig. 16 (con't)

BspGI Mspl 
CjePl| CviJI CjePI BsaXI Neil 

Hpyl78IIl| BsrI | BsmFI | Taal | ScrFI 

II I I I I III 

TGCTTCTGGACTATTGAATAATACTGGCATAGCCACACCTACAAAAGCAACTGTCCCTCC 

1321 + + + + + + 1380 

ACGAAGACCTGATAACTTATTATGACCGTATCGGTGTGGATGTTTTCGTTGACAGGGAGG 



Dpnl 
Sau3AI 
MboII I 
Taqlj 
Drdll 
Alwl I 



Cac8I | EcoRII | | | 

II I I I I 



Muni 
Tsp509I 
ScrFI 
BslI I 



ScrFI || || EcoRII | | 

Mnll BslI HI || EcoNI I j j 

Sthl32I PflMljj I j I Mnll j j j I Hpyl88IX 

Earl | j j j j Ddel | 

I I I I I I I I 

CGGCACGCCAAATCCTGGTTCGATCCCTCTTCCTACGCCAGGACAATTGACAGGGTTCTC 

1381 + + + + + + 1440 

GCCGTGCGGTTTAGGACCAAGCTAGGGAGAAGGATGCGGTCCTGTTAACTGTCCCAAGAG 



Hpyl88IX 
Hinfl | 

BseMII Earl Tfil j 

MboII |TaqII |BsmI Bfal | j Ddel 

I I I I I II I I 

AGATATGCTGAACTCTTCGTCAGCATTCGGTCTAGGAATCATCGGAAATGTCCTAAGTCA 

1441 + + + + + + 1500 

TCTATACGACTTGAGAAGCAGTCGTAAGCCAGATCCTTAGTAGCCTTTACAGGATTCAGT 



Hpyl78III 
Dpnl | 

BslI Sau3AI | | 

XmnI Mnll | CviJI Msel Ddel | j |BccI 

I II II I I I I I 

TAAAGGGAAGTCTTTCCTTACTTTGGGAGGCTTATTAAGTGCCTTAGATCAAGATGGAGA 

1501 + + + + + + 1560 

ATTTCCCTTCAGAAAGGAATGAAACCCTCCGAATAATTCACGGAATCTAGTTCTACCTCT 
Fig. 16 (con't)



Hpyl78III 
BpulOI 
Ddel 
Cvi JI | 
BciVI | 



Taal Hinfl Hinfl 

BsaXI | Tfil Tfil 

Hin4l | Hpyl78III | Bfal | NlaIIl|| 

II I I I I III 

TACTGTCATTGTCTTGAATCCTAGAATCATGGCTCAGGATACGCAACAAGCTTCGTTTTT 

1561 + + + + + + 1620 

ATGACAGTAACAGAACTTAGGATCTTAGTACCGAGTCCTATGCGTTGTTCGAAGCAAAAA 



Alul 
CviJI 
Hindi I I | 
BseMIl| | 
II 



BsmFI 

I 



NlalV 

Avail | Maelll 

Sau96I j Taal 

Taal | AlwNI | 

I I I I 

TGTAGGGCAAACGGTCCCTTACCAAACTATCAAATACTATATCCAAGAAACAGGAACTGT 

1621 + + + + + + 1680

ACATCCCGTTTGCCAGGGAATGGTTTGATAGTTTATGATATAGGTTCTTTGTCCTTGACA 



Mnll 
BsaXI | 

Clal Taal | j 

TaqI MboII Maelll Sfcl | | | 

I I I I I II 

AACGCAAAATATCGATTATGAAGATATTGGAGTGAACCTTGTCGTTACCTCTACAGTTGC 

1681 + + + + + + 1740 

TTGCGTTTTATAGCTAATACTTCTATAACCTCACTTGGAACAGCAATGGAGATGTCAACG 



Hgal 
Tsp509I | 
Hpyl88IX | j 

Ddel | j j Thai 
Dpnl | | | | Acil | 

Maelll TaqI Sau3AI | | j j | BseMII j 

I I I I I I II II 

TCCCAACAATGTAGTTACACTACAAATCGAACAGACGATCTCAGAATTACATTCCGCGTC 

1741 + + + + + + 1800 

AGGGTTGTTACATCAATGTGATGTTTAGCTTGTCTGCTAGAGTCTTAATGTAAGGCGCAG 



CviJI Hpyl78III 

Fnu4HI | Apol | 

CviRl| j Tsp509I j 

Tselj | Bbvl | j 

III II I 

TGGATCACTAACACCTGTCACAGATAAAACTTATGCAGCCACACGCTTACAAATTCCCGA 

1801 + + + + + + 1860

ACCTAGTGATTGTGGACAGTGTCTATTTTGAATACGTCGGTGTGCGAATGTTTAAGGGCT 



Dpnl Maelll 

Sau3AI I Tsp45I 

Hpyl78IIl| | Alwl | 

III I I 
Fig. 16 (con't)

Sthl32I 

Taal| Ddel Hpyl88IX Hpyl78III 

II I I I 

CGGTTGTTTCTTAGTTATGAGTGGGCATATCAGAGATAAAACTACAAAAGTGGTTTCAGG 

1861 + + + + + + 1920 

GCCAACAAAGAATCAATACTCACCCGTATAGTCTCTATTTTGATGTTTTCACCAAAGTCC 



Bcefl TaqI 

Tsp509I | Bed | 

Msel| | Rsal | j 

vsplj I CviJI | I I 

III I I I I 

AGTGCCTTTGCTAAACTCCATACCATTAATTCGTGGTTTATTTAGCCGTACCATCGACCA 

1921 + + + + + + 1980 

TCACGGAAACGATTTGAGGTATGGTAATTAAGCACCAAATAAATCGGCATGGTAGCTGGT 



Bsu36I 

Nlalll Ddel HphI 

Hpyl7 8III | CviJI | Alul| 

Real | I Msel | j CviJI I 

III III M 

AAGGCAAAAACGCAATATCATGATGTTTATTAAGCCTAAGGTGATTAGTAGCTTTGAAGA 

1981 + + + + + + 2040 

TTCCGTTTTTGCGTTATAGTACTACAAATAATTCGGATTCCACTAATCATCGAAACTTCT 



Maelll 
MboII | 
BssSI | j 
I I I 



Muni 

BciVI Tsp509I 

I I 
AGGCACTCGTGTTACCAATAAGGAAGGATACAGATACAATTGGGAAGCTGATGAAGGATC 

2041 + + + + + + 2100 

TCCGTGAGCACAATGGTTATTCCTTCCTATGTCTATGTTAACCCTTCGACTACTTCCTAG 



Dpnl 
NlalV 
BamHI 
BstYI 
Sau3AI 
HaelV | 
Hin4I | 
Alwl | | 
Alul | j j 
CviJI j | j 
I I 
Fig. 16 (con't)



NlalV 
CviJI 
Haelll 
Sau96I 
BstXI 
MslI 



Alwl | 
CviRI j 
Nlalll j 
II 



Hpyl78III 
Mnll | 
Nlalll| j 
II I 



Avail 
ECOO109I 
Psp5II 
Sau96I 
Sse8647I 
Bsml | 
BsaJI | | 
Styl | | 
II I 



CviJI 
Mnll | 

I I 



CATGCAAGTGGCCCCTCGCCATGCTCCTGAATGCCAAGGACCTCCTTCTTTACAGGCTGA 

2101 + + + + + + 2160 

GTACGTTCACCGGGGAGCGGTACGAGGACTTACGGTTCCTGGAGGAAGAAATGTCCGACT 



Ddel 

Maelll Dral Alul | BseMII 

Tsp45I Msel| CviJI j Mnll | Beef I 

I II II M I 

AAGTGACTTTAAAATAATAGAAATAGAAGCTCAGTAGTGGTATATAAAAGAGGAAGATGA 

2161 + + + + + + 2220 

TTCACTGAAATTTTATTATCTTTATCTTCGAGTCATCACCATATATTTTCTCCTTCTACT 



BsaJI 
BstDSI 
Ecil | 
Acil| j 
MboII | | j 
I II I 



Hinf I 
Hpyl88IX | 
Alul | j 
CviJI j j 
Plel j j 
XmnI | j | 
II 



Bsml 
CviRI 



CviJI Bcefl 
I I 



TATTCTCCGCCGTGGAATAGCTTCTGACTCTGTTGCATTCAGGGGGAAAGCCAAGAAGAT 

2221 + + + + + + 2280 

ATAAGAGGCGGCACCTTATCGAAGACTGAGACAACGTAAGTCCCCCTTTCGGTTCTTCTA 



Plel 
BsiEI | 
CviJI | 
Haelll | 
Eael 
EagI 
Gdill 
MboII | 
Hinf I | j 
Haeiv | j | 
Hin4I | j | 

I I II 
GTAGAGTCGGCCGTATAACT 

2281 + + 2300 

CATCTCAGCCGGCATATTGA 
Figure 17: CPN100557

tagcttgaaa tagcttcctc caattgtgat ttctgaagaa gtataggggg aaatgtcgaa 60

gagatagtct tgttttaaag gaggagggga aaacggttta atg agc aga aaa gac
Met Ser Arg Lys Asp
Arg Lys Asp

aat gag gtt tcc tta gct cgt tca att ttt aat ata tta tcc gga act 163
Asn Glu Val Ser Leu Ala Arg Ser Ile Phe Asn Ile Leu Ser Gly Thr
Asn Glu Val Ser Leu Ala Arg Ser Ile Phe Asn Ile Leu Ser Gly Thr
10 15 20

ttc tgt agt cgt att aca ggg ata ttt cga gaa att gca atg gca acc 211
Phe Cys Ser Arg Ile Thr Gly Ile Phe Arg Glu Ile Ala Met Ala Thr
Phe Cys Ser Arg Ile Thr Gly Ile Phe Arg Glu Ile Ala Met Ala Thr
25 30 35

tat ttt gga gct gat cca att gta gct gct ttc tgg tta ggt ttc cgt 259
Tyr Phe Gly Ala Asp Pro Ile Val Ala Ala Phe Trp Leu Gly Phe Arg
Tyr Phe Gly Ala Asp Pro Ile Val Ala Ala Phe Trp Leu Gly Phe Arg
40 45 50

act gtt ttt ttc tta aga aaa att tta gga ggg ctc att cta gaa caa 307
Thr Val Phe Phe Leu Arg Lys Ile Leu Gly Gly Leu Ile Leu Glu Gln
Thr Val Phe Phe Leu Arg Lys Ile Leu Gly Gly Leu Ile Leu Glu Gln
55 60 65

gcc ttc atc cct cat ttt gaa ttt ctc cgt gct caa agt ctc gat cgt
Ala Phe Ile Pro His Phe Glu Phe Leu Arg Ala Gln Ser Leu Asp Arg
Ala Phe Ile Pro His Phe Glu Phe Leu Arg Ala Gln Ser Leu Asp Arg
70 75 80 85

gcg gcg ttt ttt ttc cga cgc ttt tct aga ttg att aaa ggc agc act
Ala Ala Phe Phe Phe Arg Arg Phe Ser Arg Leu Ile Lys Gly Ser Thr
Ala Ala Phe Phe Phe Arg Arg Phe Ser Arg Leu Ile Lys Gly Ser Thr
90 95 100

att ata ttc act ctg ctt att gaa gca gta ttg tgg gta ttc ttc aat 451
Ile Ile Phe Thr Leu Leu Ile Glu Ala Val Leu Trp Val Phe Phe Asn
Ile Ile Phe Thr Leu Leu Ile Glu Ala Val Leu Trp Val Phe Phe Asn
105 110 115

aac gtt gaa gag ggg act tac gat atg att ctc ctt act atg ata ctc 499
Asn Val Glu Glu Gly Thr Tyr Asp Met Ile Leu Leu Thr Met Ile Leu
Asn Val Glu Glu Gly Thr Tyr Asp Met Ile Leu Leu Thr Met Ile Leu
120 125 130

ttg ccc tgt ggc att ttc tta atg atg tac aat gta aac ggc gct ttg 547
Leu Pro Cys Gly Ile Phe Leu Met Met Tyr Asn Val Asn Gly Ala Leu
Leu Pro Cys Gly Ile Phe Leu Met Met Tyr Asn Val Asn Gly Ala Leu
135 140 145

ctt cac tgt gga aat aag ttt ttc ggg gtg gga tta gct ccc gta gtt
Leu His Cys Gly Asn Lys Phe Phe Gly Val Gly Leu Ala Pro Val Val
Leu His Cys Gly Asn Lys Phe Phe Gly Val Gly Leu Ala Pro Val Val
150 155 160 165

gta aat atc att tgg att ttc ttt gtt ata gcg
Val Asn Ile Ile Trp Ile Phe Phe Val Ile Ala
Val Asn Ile Ile Trp Ile Phe Phe Val Ile Ala
170 175
Fig. 17 (con't)
His
Ile
Gly
Gly
260
Pro
Leu
Leu
Fig. 17 (con't)
tct aag tta ctt tgg gag agc atc cgg cgt tcc ata aaa gtt atg gga 1459
Ser Lys Leu Leu Trp Glu Ser Ile Arg Arg Ser Ile Lys Val Met Gly
Ser Lys Leu Leu Trp Glu Ser Ile Arg Arg Ser Ile Lys Val Met Gly
440 445 450

ccc tta tcc tcc ata acg gct caa gca att gct ttt tta tct gag agc 1603
Pro Leu Ser Ser Ile Thr Ala Gln Ala Ile Ala Phe Leu Ser Glu Ser
Pro Leu Ser Ser Ile Thr Ala Gln Ala Ile Ala Phe Leu Ser Glu Ser

540

545

taatcatgtt tgtttcttgt agctcagtcg ctttctttta gctttaagtt ttgatagcct 1801
gcttggtctt ctgtttctac acttaatatt gatactaagg atactatgaa aaaacaggta 1861

tatcaatggt tagcgagtgt ggttctttta gcgctgaca
Figure 18 (RY-43)

Restriction enzyme analysis of CPN100557



TaqI 

Alul Alul Muni Hpyl88IX Eco57I | 

CviJI CviJI Tsp509I Mnll | MboII Earl | j 

I I I I I I II I 

TAGCTTGAAATAGCTTCCTCCAATTGTGATTTCTGAAGAAGTATAGGGGGAAATGTCGAA 

1 + + + + + + 60

ATCGAACTTTATCGAAGGAGGTTAACACTAAAGACTTCTTCATATCCCCCTTTACAGCTT 



Mnll 
Dral 
Msel | 
Mnll | | 
MboII | | | 
I III 



Msel 
BseRI | 
Taal | j 
II I 



Mnll 
I 



GAGATAGTCTTGTTTTAAAGGAGGAGGGGAAAACGGTTTAATGAGCAGAAAAGACAATGA 
CTCTATCAGAACAAAATTTCCTCCTCCCCTTTTGCCAAATTACTCGTCTTTTCTGTTACT 



Hpyl78III 
Mspl | 
BsaWI | j 

BspEI | | Sfcl 

. . IN I 

GGTTTCCTTAGCTCGTTCAATTTTTAATATATTATCCGGAACTTTCTGTAGTCGTATTAC 

121 + + + + + + 180

CCAAAGGAATCGAGCAAGTTAAAAATTATATAATAGGCCTTGAAAGACATCAGCATAATG 



Alul 
CviJI 
BpulOI | 

Ddel |Tsp509l Msel 
II II 



Muni 
Tsp509I 
Dpnl 
Bbvl 



Tsp509I 
Hpyl78III |CviRI Acelll 
TaqI | j Cj el | BsrDI 

II I 



Sau3AI 
Alul 
CviJI 
Alwl | 

I I 



Fnu4HI 
Alul | 
CviJI j 
Tselj 
Cjel | | 
I II 



AGGGATATTTCGAGAAATTGCAATGGCAACCTATTTTGGAGCTGATCCAATTGTAGCTGC 

181 + + + + + + 240 

TCCCTATAAAGCTCTTTAACGTTACCGTTGGATAAAACCTCGACTAGGTTAACATCGACG 



Taal 
Rsal | 
I 



Msel Mnll 
Aflll| Apol | 
Smll|Tsp509I j 
II I I 



Hpyl78III 
Banll Bfal | 
Bspl286I Foklj 
CviJI |XbaI | j 
I I III 



241 + + + + + + 300 

AAAGACCAATCCAAAGGCATGACAAAAAAAGAATTCTTTTTAAAATCCTCCCGAGTAAGA 
Fig. 18 (con't)



CviJI 

I 



Mnll 
Apol | 
Tsp509I j 
II 



Fnu4HI 
BsiEI 
Pvul 
Dpnl 
BsmAI 



Sau3AI 
Hpyl78III | 
BsiHKAI | j 

Bspl286I Taqlj 
I III 



I 

Taul 
Acil | 
'I 



AGAACAAGCCTTCATCCCTCATTTTGAATTTCTCCGTGCTCAAAGTCTCGATCGTGCGGC 

301 + + + + + + 360 

TCTTGTTCGGAAGTAGGGAGTAAAACTTAAAGAGGCACGAGTTTCAGAGCTAGCACGCCG 



Hpyl8 8IX 
I 



yl78III Fnu4HI 

Bf al | Tsel| 

Hgal | Mmel | | 

Xbal| | Msel | j | 

III I I II 



CAAAAAAAAGGCTGCGAAAAGATCTAACTAATTTCCGTCGTGATAATATAAGTGAGACGA 



MboII 
I 



Mnll 
Acll | 
Earl j 
Maell j 
II 



BsmFI 
Hinf I 
MboII Tfil 
I I 



TATTGAAGCAGTATTGTGGGTATTCTTCAATAACGTTGAAGAGGGGACTTACGATATGAT 



ATAACTTCGTCATAACACCCATAAGAAGTTATTGCAACTTCTCCCCTGAATGCTATACTA 

Rsal 
BsrGI I 
Msel TatI | 

I I I 

TCTCCTTACTATGATACTCTTGCCCTGTGGCATTTTCTTAATGATGTACAATGTAAACGG 

481 + + + + + + 540 

AGAGGAATGATACTATGAGAACGGGACACCGTAAAAGAATTACTACATGTTACATTTGCC 



Sthl32I 
BscGI | 
Alul | I 
CviJI j j 
I I I 

CGCTTTGCTTCACTGTGGAAATAAGTTTTTCGGGGTGGGATTAGCTCCCGTAGTTGTAAA 

541 + + + + + + 600 

GCGAAACGAAGTGACACCTTTATTCAAAAAGCCCCACCCTAATCGAGGGCATCAACATTT 



Haell 
Hhal | 

II 



Sthl32I 
TspRI 
Taal | 
Beef 1 1 j 
II I 
Fig. 18 (con't)



CviJI 
Fnu4HI | 
Taul | 
Acil| | 
II I 



Bfal 
Dpnl 
BstYI 
Sau3AI 
Hpyl8 8IX| 
Alwl | j 
I II 



TATCATTTGGATTTTCTTTGTTATAGCGGCTCGTCATTCAGATCCTAGAGAGCGTATTAT 

601 + + + + + + 660 

ATAGTAAACCTAAAAGAAACAATATCGCCGAGCAGTAAGTCTAGGATCTCTCGCATAATA 

Bfal 

Sthl32l| ScrFI 

CviJI || Cjel EcoRII | 

BsaJI | | j NspV | NlaIV| j 

BstDSI j | j MboII TaqI |MseI Taal | j j 

I I II I III I II I 

CGGTTTATCCGTGGCTCTAGTTATCGGGTTTTTCTTCGAATGGTTAATCACGGTTCCTGG 

661 + + + + + + 720 

GCCAAATAGGCACCGAGATCAATAGCCCAAAAAGAAGCTTACCAATTAGTGCCAAGGACC 



Bce83I 
Earl I 
Sapl j 

Apol Bpml | | 

Tsp509I Cjel | j j 
I I I II 



BplI Alul 
Ms II | CviJI 
Mnll | | TaqI | 
I I I I I 



Hpyl78III 
MboII 

Banll 
Bspl286I 
Hin4I 
CviJI |Smll| 
I I II 

AGTATGGAAATTTCTATTAGAAGCGAAGAGCCCACCTCAAGAACACGATAGTGTTCGAGC 

721 + + + + + + 780 

TCATACCTTTAAAGATAATCTTCGCTTCTCGGGTGGAGTTCTTGTGCTATCACAAGCTCG 

Tthlllll 

Alul 
CviJI 

Alul MspAlI 
CviJI MboII PvuII 

Mwol | Msel | Sf aNI | 

I I I I II 

TTTATTAGCTCCCTTATCTTTGGGTATTTTAACTTCAAGCATCTTCCAGCTGAACCTTCT 

781 + + + + + + 840 

AAATAATCGAGGGAATAGAAACCCATAAAATTGAAGTTCGTAGAAGGTCGACTTGGAAGA 



ECORV 
Hpyl8 8IX | 
Tthlllll | j 
I I 



CviJI 
Haelll 
EcoO109I 
Nlalll | 
Cac8I Rsal | | 
Mwol | BsrGI | | | 
CviJI | j TatI | | Sau96I 
III III I 



Rsal 
TatI | 
Mnll | j 
I I I 



TTCTGATATCTGCTTGGCTCGCTATGTACATGAAATAGGCCCTCTATATCTTATGTACTC 
AAGACTATAGACGAACCGAGCGATACATGTACTTTATCCGGGAGATATAGAATACATGAG 
Fig. 18 (con't)



Alul 

Msel CviJI Acelll CviJI BseRI Taal 

I I I I I I 

CTTAAAGATTTATCAGCTCCCCATACATCTCTTTGGCTTTGGTGTGTTTACCGTTCTCCT 

901 + + + + + + 960 

GAATTTCTAAATAGTCGAGGGGTATGTAGAGAAACCGAAACCACACAAATGGCAAGAGGA 



MboII 
Nlalll 
Hpyl78III 
Real 

Rsal Dpnl 
Mnll BsrGI | Mnll 

Tsp509I I TatI j Sau3AI | 

II II M 

CCCAGCAATTTCTCGTTGTGTACAGCGAGAAGATCATGAGAGGGGATTGAAACTTATGAA 

961 + + + + + + 1020 

GGGTCGTTAAAGAGCAACACATGTCGCTCTTCTAGTACTCTCCCCTAACTTTGAATACTT 



Dpnl 
Bell | 

HphI Nlalll Sau3AI j CviJI Ddel 

I I I I I I 

GTTCGTTCTCACCCTAACCATGTCCGTAATGATCATTATGACAGCAGGGCTATTGCTCTT 

1021 + + + + + + 1080

CAAGCAAGAGTGGGATTGGTACAGGCATTACTAGTAATACTGTCGTCCCGATAACGAGAA 



BsaXI 
Hin4I 
Hinfl 
ScrFI | 
ECORII | 
Alul | | 

CviJI | j 

I 



Hpyl8 8IX 

Pie I Bpml Ddel | 

I I I I 

AGCTTTACCTGGAGTCCGTGTCCTTTATGAACACGGACTTTTCCCTCAGAGTGCTGTCTA 

1081 + + + + + + 1140

TCGAAATGGACCTCAGGCACAGGAAATACTTGTGCCTGAAAAGGGAGTCTCACGACAGAT 



BseMII 
Mwol 
AccI | 
Mnll | j 

' I 



BsrI 
NlalV | 
BanI | | 
I I 



BsaJl 
Styl 
CviJI | 
Haelj NlalV 
HaeIIl|CviJl| 
II II 



CGCTATTGTTCGTGTATTGCGAGGTTATGGTGCCAGTATTATCCCTATGGCCTTGGCTCC 



GCGATAACAAGCACATAACGCTCCAATACCACGGTCATAATAGGGATACCGGAACCGAGG 
Fig. 18 (con't)

Fnu4HI 
Taul 
Acil | 

MspAlI | BsrBI Hinfl 

BsmAI CviRl | | Acil | Tfil 

I III M I 

TTTAGTCTCTGTTCTTTTTTATGCACAGCGGCAGTATGCTGTTCCGCTCTTTATAGGAAT 
Fig. 18 (con't)
Figure 19: CPN100622

tctcaagagt aaccttatcc ttagattatt cagctcaagt ctcctcgtca actgtaggtc 60

aataccttaa agctgagagt cattgcacat tttaaccaca atg aaa aca tca agg 115

Met Lys Thr Ser Arg
1 5

aat aaa cag tgc aaa ata aca gat ccc tta agt aaa tct tcc ttc ttt 163
Asn Lys Gln Cys Lys Ile Thr Asp Pro Leu Ser Lys Ser Ser Phe Phe
10 15 20

gtt gga gcc tta att tta ggt aaa act aca ata ctc ctt aat gcg act 211
Val Gly Ala Leu Ile Leu Gly Lys Thr Thr Ile Leu Leu Asn Ala Thr
25 30 35

ccg ttg tct gac tat ttt gat aat caa gca aat caa ctc aca aca ctc 259
Pro Leu Ser Asp Tyr Phe Asp Asn Gln Ala Asn Gln Leu Thr Thr Leu
40 45 50

ttc cct cta att gat act ctt act aac atg act ccc tac tct cat aga 307
Phe Pro Leu Ile Asp Thr Leu Thr Asn Met Thr Pro Tyr Ser His Arg
55 60 65

gca aca ctt ttt gga gtt agg gat gac act aac caa gac att gtc ctc 355
Ala Thr Leu Phe Gly Val Arg Asp Asp Thr Asn Gln Asp Ile Val Leu
70 75 80 85

gat cac cag aat tcc ata gaa agc tgg ttc gaa aac ttc tct caa gac 403
Asp His Gln Asn Ser Ile Glu Ser Trp Phe Glu Asn Phe Ser Gln Asp
90 95 100

ggc ggt gct ctc tct tgc aaa tca ctt gcc ata acg aat aca aaa aac 451
Gly Gly Ala Leu Ser Cys Lys Ser Leu Ala Ile Thr Asn Thr Lys Asn
105 110 115

caa att ctt ttc cta aat agc ttt gct att aaa aga gct ggt gcg atg 499
Gln Ile Leu Phe Leu Asn Ser Phe Ala Ile Lys Arg Ala Gly Ala Met
120 125 130

tat gtt gat ggt aat ttc gat ctt tct gag aat cat ggt tcc atc att 547
Tyr Val Asp Gly Asn Phe Asp Leu Ser Glu Asn His Gly Ser Ile Ile
135 140 145

ttc tct ggg aat tta agc ttt cct aat gca agt aat ttc gct gat act 595
Phe Ser Gly Asn Leu Ser Phe Pro Asn Ala Ser Asn Phe Ala Asp Thr
150 155 160 165

Cvs Th~ 111 1?J ttt r tta c Cg aat g " aca atC tca aaa 643 

Cys Thr Gly Gly Ala Val Leu Cys Ser Lys Asn Val Thr Ile Ser Lys
Thr Gly Gly Ala Val Leu Cys Ser Lys Asn Val Thr Ile Ser Lys
170 175 180

aat caa gga acc gca tac ttc att aac aac aag gca aaa tct tca aaa 691
Asn Gln Gly Thr Ala Tyr Phe Ile Asn Asn Lys Ala Lys Ser Ser Gly
Asn Gln Gly Thr Ala Tyr Phe Ile Asn Asn Lys Ala Lys Ser Ser Gly
185 190 195
Fig. 19 (con't)

gga gca atc caa gct gca atc ata aac att aag gac aac act ggc cct 739
Gly Ala Ile Gln Ala Ala Ile Ile Asn Ile Lys Asp Asn Thr Gly Pro
Gly Ala Ile Gln Ala Ala Ile Ile Asn Ile Lys Asp Asn Thr Gly Pro
200 205 210

tgc ctg ttt ttt aat aat gct gca ggc gga aca gcg ggg ggc gcg ttg 787
Cys Leu Phe Phe Asn Asn Ala Ala Gly Gly Thr Ala Gly Gly Ala Leu
Cys Leu Phe Phe Asn Asn Ala Ala Gly Gly Thr Ala Gly Gly Ala Leu
215 220 225

ttc gct aat gct tgt aga att gag aat aat tct cag cct atc tat ttt 835
Phe Ala Asn Ala Cys Arg Ile Glu Asn Asn Ser Gln Pro Ile Tyr Phe
Phe Ala Asn Ala Cys Arg Ile Glu Asn Asn Ser Gln Pro Ile Tyr Phe
230 235 240 245

ttg aat aac caa tca ggt ctg ggt ggt gca ata aga gta cat caa gag 883
Leu Asn Asn Gln Ser Gly Leu Gly Gly Ala Ile Arg Val His Gln Glu
Leu Asn Asn Gln Ser Gly Leu Gly Gly Ala Ile Arg Val His Gln Glu
250 255 260

tgc att ctt aca aag aat acc ggt tct gtg atc ttc aac aat aat ttt 931
Cys Ile Leu Thr Lys Asn Thr Gly Ser Val Ile Phe Asn Asn Asn Phe
Cys Ile Leu Thr Lys Asn Thr Gly Ser Val Ile Phe Asn Asn Asn Phe
265 270 275

gcc atg gaa gcg gac atc tct gct aac cat tcc tct gga ggg gct atc 979
Ala Met Glu Ala Asp Ile Ser Ala Asn His Ser Ser Gly Gly Ala Ile
Ala Met Glu Ala Asp Ile Ser Ala Asn His Ser Ser Gly Gly Ala Ile
280 285 290

tat tgc att agt tgt tct ata aaa gac aac cca gga att gca gcc ttc 1027
Tyr Cys Ile Ser Cys Ser Ile Lys Asp Asn Pro Gly Ile Ala Ala Phe
Tyr Cys Ile Ser Cys Ser Ile Lys Asp Asn Pro Gly Ile Ala Ala Phe
295 300 305

gat aat aat act gca gca cga gat gga ggt gct atc tgt aca caa tct 1075
Asp Asn Asn Thr Ala Ala Arg Asp Gly Gly Ala Ile Cys Thr Gln Ser
Asp Asn Asn Thr Ala Ala Arg Asp Gly Gly Ala Ile Cys Thr Gln Ser
310 315 320 325

cta act ata caa gac agt ggt ccc gtc tat ttc aca aac aat cag gga 1123
Leu Thr Ile Gln Asp Ser Gly Pro Val Tyr Phe Thr Asn Asn Gln Gly
Leu Thr Ile Gln Asp Ser Gly Pro Val Tyr Phe Thr Asn Asn Gln Gly
330 335 340

act tgg ggc ggc gct atc atg ctc cgt caa gat ggt gca tgc act tta 1171
Thr Trp Gly Gly Ala Ile Met Leu Arg Gln Asp Gly Ala Cys Thr Leu
Thr Trp Gly Gly Ala Ile Met Leu Arg Gln Asp Gly Ala Cys Thr Leu
345 350 355

ttt gct gat cag gga gat att att ttt tat aat aat aga cac ttc aaa 1219
Phe Ala Asp Gln Gly Asp Ile Ile Phe Tyr Asn Asn Arg His Phe Lys
Phe Ala Asp Gln Gly Asp Ile Ile Phe Tyr Asn Asn Arg His Phe Lys
360 365 370

gat act ttc agc aat cat gtt tct gta aac tgc acg cgt aat gtc tca 1267
Asp Thr Phe Ser Asn His Val Ser Val Asn Cys Thr Arg Asn Val Ser
Asp Thr Phe Ser Asn His Val Ser Val Asn Cys Thr Arg Asn Val Ser
375 380 385
Fig. 19 (con't)

tta aca gtt gga gca agt caa ggt cat tct gct acc ttc tat gat ccc
Leu Thr Val Gly Ala Ser Gln Gly His Ser Ala Thr Phe Tyr Asp Pro
Leu Thr Val Gly Ala Ser Gln Gly His Ser Ala Thr Phe Tyr Asp Pro
390 395 400 405

ata cta caa aga tat act ata caa aac tct atc caa aaa ttt aat cct
Ile Leu Gln Arg Tyr Thr Ile Gln Asn Ser Ile Gln Lys Phe Asn Pro
Ile Leu Gln Arg Tyr Thr Ile Gln Asn Ser Ile Gln Lys Phe Asn Pro
410 415 420

aat cca gaa cac ctc gga act atc ttg ttc tcc tca aca tat att ccg
Asn Pro Glu His Leu Gly Thr Ile Leu Phe Ser Ser Thr Tyr Ile Pro
Asn Pro Glu His Leu Gly Thr Ile Leu Phe Ser Ser Thr Tyr Ile Pro
425 430 435

gat aca tcg act tct cgt gat gac ttc att tca cat ttc aga aac cac
Asp Thr Ser Thr Ser Arg Asp Asp Phe Ile Ser His Phe Arg Asn His
Asp Thr Ser Thr Ser Arg Asp Asp Phe Ile Ser His Phe Arg Asn His
440 445 450

att gga ctg tac aac ggc aca ctc gct ctt gaa gat cga gca gag tgg
Ile Gly Leu Tyr Asn Gly Thr Leu Ala Leu Glu Asp Arg Ala Glu Trp
Ile Gly Leu Tyr Asn Gly Thr Leu Ala Leu Glu Asp Arg Ala Glu Trp
455 460 465

aaa gtc tat aaa ttt gat caa ttt ggt ggg act cta cgg tta ggc agt
Lys Val Tyr Lys Phe Asp Gln Phe Gly Gly Thr Leu Arg Leu Gly Ser
Lys Val Tyr Lys Phe Asp Gln Phe Gly Gly Thr Leu Arg Leu Gly Ser
470 475 480 485

aga gct gtg ttt tct aca aca gac gaa gaa caa agt agc agt agt gtg
Arg Ala Val Phe Ser Thr Thr Asp Glu Glu Gln Ser Ser Ser Ser Val
Arg Ala Val Phe Ser Thr Thr Asp Glu Glu Gln Ser Ser Ser Ser Val
490 495 500

ggt tct gta att aac atc aat aat ctt gca att aac ctt ccc tct atc
Gly Ser Val Ile Asn Ile Asn Asn Leu Ala Ile Asn Leu Pro Ser Ile
Gly Ser Val Ile Asn Ile Asn Asn Leu Ala Ile Asn Leu Pro Ser Ile
505 510 515

tta ggc aac aga gtt gct ccc aag cta tgg att cgc ccc aca ggt tca
Leu Gly Asn Arg Val Ala Pro Lys Leu Trp Ile Arg Pro Thr Gly Ser
Leu Gly Asn Arg Val Ala Pro Lys Leu Trp Ile Arg Pro Thr Gly Ser
520 525 530

tca gca ccc tat agc gaa gat aat aac cct ata atc aat ctc tca gga
Ser Ala Pro Tyr Ser Glu Asp Asn Asn Pro Ile Ile Asn Leu Ser Gly
Ser Ala Pro Tyr Ser Glu Asp Asn Asn Pro Ile Ile Asn Leu Ser Gly
535 540 545

cct ttg agc cta ctg gat gac gag aac cta gat ccc tat gat act gca
Pro Leu Ser Leu Leu Asp Asp Glu Asn Leu Asp Pro Tyr Asp Thr Ala
Pro Leu Ser Leu Leu Asp Asp Glu Asn Leu Asp Pro Tyr Asp Thr Ala
550 555 560 565

gac ctt gcc caa cct atc gca gaa gtt cct ctt ctg tat ctc tta gac
Asp Leu Ala Gln Pro Ile Ala Glu Val Pro Leu Leu Tyr Leu Leu Asp
Asp Leu Ala Gln Pro Ile Ala Glu Val Pro Leu Leu Tyr Leu Leu Asp
570 575 580

acg acc ttt aag ttt taaaagcatg ttatatagac aatgcaacct gtaaagacca 3002
Thr Thr Phe Lys Phe
Thr Thr Phe Lys Phe
950

aatagagagt agtgaacact ctctaccatc atgaatctta tgggagaagc taagggaaat 3062 
Fig. 19 (con't) 

        cat agc ttt ggt 2275 

Ser Asn His Ser Phe Gly 

Ser Asn His Ser Phe Gly 

720 725 

gta aac ttc tcc caa ctt ttc agt aat ctc tac gag agc cac tcc gac 2323 

Val Asn Phe Ser Gln Leu Phe Ser Asn Leu Tyr Glu Ser His Ser Asp 

Val Asn Phe Ser Gln Leu Phe Ser Asn Leu Tyr Glu Ser His Ser Asp 

730 735 740 

aat tcc gtg gct tcg cat acg aca act gta gcg ctc cag atc aat aat 2371 

Asn Ser Val Ala Ser His Thr Thr Thr Val Ala Leu Gln Ile Asn Asn 

Asn Ser Val Ala Ser His Thr Thr Thr Val Ala Leu Gln Ile Asn Asn 

745 750 755 

cct tgg ctg caa gag aga ttc tct aca tct gca tct cta gcc tac agc 2419 

Pro Trp Leu Gln Glu Arg Phe Ser Thr Ser Ala Ser Leu Ala Tyr Ser 

Pro Trp Leu Gln Glu Arg Phe Ser Thr Ser Ala Ser Leu Ala Tyr Ser 

760 765 770 
Figure 20 (RY-44) 

Restriction enzyme analysis of CPN100622 



Mnll 
Taal 



Maelll 
Hpyl78III | 
Smll | | 
I I 



Ddel 
Bce83I | 
II 



Smll 
Alul | 
Cvi JI j 
BseRI | | 
III 



Hindi 
BsmAI | 
AceIIl| j 

I I I 



TCTCAAGAGTAACCTTATCCTTAGATTATTCAGCTCAAGTCTCCTCGTCAACTGTAGGTC 

1 + + + + + + 60 

AGAGTTCTCATTGGAATAGGAATCTAATAAGTCGAGTTCAGAGGAGCAGTTGACATCCAG 

BsrDI 
Hinfl 
Ddel | 
Msel Alul | j CviRI 
BseMII |CviJl| j Plel Msel 

I I II I I I I 

AATACCTTAAAGCTGAGAGTCATTGCACATTTTAACCACAATGAAAACATCAAGGAATAA 

61 + + + + + + 120 

TTATGGAATTTCGACTCTCAGTAACGTGTAAAATTGGTGTTACTTTTGTAGTTCCTTATT 



Mmel 
MboII 
Msel | 
Aflll| | 



Dpnl 
BstYI 
Sau3AI 
Alwl 
TspRI | 
CviRI | j 
Taal | | | 
I I I I 



Tsp509I 
Msel | 
CviJI | | 
NlaIV| j | 
II II 



ACAGTGCAAAATAACAGATCCCTTAAGTAAATCTTCCTTCTTTGTTGGAGCCTTAATTTT 

121 + + + + + + 180 

TGTCACGTTTTATTGTCTAGGGAATTCATTTAGAAGGAAGAAACAACCTCGGAATTAAAA 



Plel Hpyl88IX 
Msel | Hinfl DrdI | 

II I I I 

AGGTAAAACTACAATACTCCTTAATGCGACTCCGTTGTCTGACTATTTTGATAATCAAGC 

181 + + + + + + 240 

TCCATTTTGATGTTATGAGGAATTACGCTGAGGCAACAGACTGATAAAACTATTAGTTCG 

Bsbl Hinfl 
Tthlllll | Tsp509I Nlalll 

MboII | | Earl | Mnll Plel | 

III III II 

AAATCAACTCACAACACTCTTCCCTCTAATTGATACTCTTACTAACATGACTCCCTACTC 

241 + + + + + + 300 

TTTAGTTGAGTGTTGTGAGAAGGGAGATTAACTATGAGAATGATTGTACTGAGGGATGAG 
Fig. 20 (con't) 



Dpnl 
Sau3AI 
HphI 
Tthllll | 

Cjel Bsbl Cjel Fokl | j TaqI 

I I I I I I I 

TCATAGAGCAACACTTTTTGGAGTTAGGGATGACACTAACCAAGACATTGTCCTCGATCA 

301 + + + + + + 360 

AGTATCTCGTTGTGAAAAACCTCAATCCCTACTGTGATTGGTTCTGTAACAGGAGCTAGT 



Apol 
EcoRI 
Tsp509I 
Mnll | 



NspV 
TaqI 
Drdll | 
Bce83l| | 
Alul | | | 
CviJI j j | 
I II 



Hpyl78III 
Smll | 

I I 



BsiHKAI 
Bspl286I 
Acil | : 

I I 



CCAGAATTCCATAGAAAGCTGGTTCGAAAACTTCTCTCAAGACGGCGGTGCTCTCTCTTG 

361 + + + + + + 420 

GGTCTTAAGGTATCTTTCGACCAAGCTTTTGAAGAGAGTTCTGCCGCCACGAGAGAGAAC 



Acelll 

Apol Alul | 

Tsp509I CviJI j Msel 

I I I I 

CAAATCACTTGCCATAACGAATACAAAAAACCAAATTCTTTTCCTAAATAGCTTTGCTAT 

421 + + + + + + 480 

GTTTAGTGAACGGTATTGCTTATGTTTTTTGGTTTAAGAAAAGGATTTATCGAAACGATA 



Hinfl 
Ddel | 
Hpyl88IX 
Dpnl 
Sau3AI | 
BseMIl| j | NlalV 

Alul Tsp509I Ml | Drdll | 

CviJI Bed | TaqI j j |TfiI Nlalll | j 

I I I II I I I III 

TAAAAGAGCTGGTGCGATGTATGTTGATGGTAATTTCGATCTTTCTGAGAATCATGGTTC 

481 + + + + + + 540 

ATTTTCTCGACCACGCTACATACAACTACCATTAAAGCTAGAAAGACTCTTAGTACCAAG 



Alul 
CviJI 
Hindlll 
Msel | 
Apol | | 
Tsp509I | | 
I I I 



Rsal 
BsrGI | 

Tsp509I TatI j 

Bed Tsp509I | | | CviRI | Acelll | j 

I I I I I II Ml 

CATCATTTTCTCTGGGAATTTAAGCTTTCCTAATGCAAGTAATTTCGCTGATACTTGTAC 

541 + + + + + + 600 

GTAGTAAAAGAGACCCTTAAATTCGAAAGGATTACGTTCATTAAAGCGACTATGAACATG 
Fig. 20 (con't) 



Alul NspV Acil 

CviJI Taql Maelll NlalV | 

III II 
AGGGGGAGCTGTTTTATGTTCGAAAAATGTTACAATCTCAAAAAATCAAGGAACCGCATA 

601 + + + + + + 660 

TCCCCCTCGACAAAATACAAGCTTTTTACAATGTTAGAGTTTTTTAGTTCCTTGGCGTAT 



CviRI 
BseRI 
Fnu4HI 
Hin4I Alul | 

Bbvl | CviJI j 

Eco57I Hpyl78III | | Tselj 

Msel | MboII Mnll | j | Mwol | j | Msel 

III I I I I I II I I 

CTTCATTAACAACAAGGCAAAATCTTCAGGAGGAGCAATCCAAGCTGCAATCATAAACAT 

661 + + + + + + 720 

GAAGTAATTGTTGTTCCGTTTTAGAAGTCCTCCTCGTTAGGTTCGACGTTAGTATTTGTA 



BsrI 
TspRI 
CviJI | 
Haelll j 
Sau96I | j 
Bsbl I I I 

I III 



Ecil 
Faul 
Acil 
Sthl32I 
Cac8I 
PstI 
CviRI 
Fnu4HI | 
Sfcl j 
Tsel| j 
II I I 



Acil 
MspAlI 
I 



Mwol 
I 



Bbvl Msel 

I I 

TAAGGACAACACTGGCCCTTGCCTGTTTTTTAATAATGCTGCAGGCGGAACAGCGGGGGG 

721 + + + + + + 780 

ATTCCTGTTGTGACCGGGAACGGACAAAAAATTATTACGACGTCCGCCTTGTCGCCCCCC 



Mwol 
Tthlllll | 

Hhal | | CviJI 

Thai j | Tsp509I Tsp509I Ddel | BseMII 

III I III I 

CGCGTTGTTCGCTAATGCTTGTAGAATTGAGAATAATTCTCAGCCTATCTATTTTTTGAA 

781 + + + + + + 840 

GCGCAACAAGCGATTACGAACATCTTAACTCTTATTAAGAGTCGGATAGATAAAAAACTT 

Hpyl78III 
Rsal | BsmI 

Taqll CviRI TatI | j CviRI 

I I I I I I 

TAACCAATCAGGTCTGGGTGGTGCAATAAGAGTACATCAAGAGTGCATTCTTACAAAGAA 

841 + + + + + + 900 



ATTGGTTAGTCCAGACCCACCACGTTATTCTCATGTAGTTCTCACGTAAGAATGTTTCTT 
Fig. 20 (con't) 



Acil 
Mwol 
Nlalll | 
BsaJI 
BstDSI 
Ncol 

Tsp509I Styl 

I I 

TACCGGTTCTGTGATCTTCAACAATAATTTTGCCATGGAAGCGGACATCTCTGCTAACCA 

901 + + + + + + 960 

ATGGCCAAGACACTAGAAGTTGTTATTAAAACGGTACCTTCGCCTGTAGAGACGATTGGT 



Dpnl 
Sau3AI 
MboII 
Mspl 
BsaWI | 
BsrFI j 
PinAI j 
II 



CviJI 
Hin4I | 

BslI | | Tsp509I 
Hpyl78III III ScrFI | Fnu4HI 

BstXI || || Bpml BsaJI | j CviRI | 

Mnll | j |Mnll| CviRI | EcoRII j j Tsel | 

I I I I II II I I I II 

TTCCTCTGGAGGGGCTATCTATTGCATTAGTTGTTCTATAAAAGACAACCCAGGAATTGC 

961 + + + + + + 1020 

AAGGAGACCTCCCCGATAGATAACGTAATCAACAAGATATTTTCTGTTGGGTCCTTAACG 



TaqI 
CviJI | Bbvl 
I I I 



Mnll 
BssSI 
PstI | 
Fnu4HI | | 
CviRI | j | 
Tsel j j j 
Sfcl | | | | 
I II 



Bbvl 
Bed |Hin4I 

II I 

AGCCTTCGATAATAATACTGCAGCACGAGATGGAGGTGCTATCTGTACACAATCTCTAAC 

1021 + + + + + + 1080 

TCGGAAGCTATTATTATGACGTCGTGCTCTACCTCCACGATAGACATGTGTTAGAGATTG 



Rsal 
BsrGI | 
Tat I j 

I I 



BsmFI 
I 



Sthl32I 
BscGI | 
NlalV 



TspRI 
Avail | 
Sau96I j 
PshAI | j 
Taal | j j 
II I 



Haell 
Fnu4HI | 
Taul j 
Tthlllll Acil |HhaI j 
I II II 

TATACAAGACAGTGGTCCCGTCTATTTCACAAACAATCAGGGAACTTGGGGCGGCGCTAT 

1081 + + + + + + 1140 

ATATGTTCTGTCACCAGGGCAGATAAAGTGTTTGTTAGTCCCTTGAACCCCGCCGCGATA 
Fig. 20 (con't) 

CviRI 
Nlalll 
Nspl 
SphI 

Cac8I | Dpnl 
Hpyl78III CviRI | | Bell | 

Nlalll | Bed | I | Sau3AI j 

I I I I I I II 
CATGCTCCGTCAAGATGGTGCATGCACTTTATTTGCTGATCAGGGAGATATTATTTTTTA 

1141 + + + + + + 1200 

GTACGAGGCAGTTCTACCACGTACGTGAAATAAACGACTAGTCCCTCTATAATAAAAAAT 

Mmel 
Thai 
AflHI | 
Cac8I | 

Nlalll Mlul j 

Bsgl | CviRI | j 

II III 
TAATAATAGACACTTCAAAGATACTTTCAGCAATCATGTTTCTGTAAACTGCACGCGTAA 

1201 + + + + + + 1260 

ATTATTATCTGTGAAGTTTCTATGAAAGTCGTTAGTACAAAGACATTTGACGTGCGCATT 

Dpnl 

Msel Sau3AI I 

BsmAI | Taal Alwl | j 

II I I I I 

TGTCTCATTAACAGTTGGAGCAAGTCAAGGTCATTCTGCTACCTTCTATGATCCCATACT 

1261 + + + + + + 1320 

ACAGAGTAATTGTCAACCTCGTTCAGTTCCAGTAAGACGATGGAAGATACTAGGGTATGA 

CjePI 
Msel | 

Apol | | Hpyl88IX 
Tsp509I | |Hpyl78III BsaJI | 

III I II 

ACAAAGATATACTATACAAAACTCTATCCAAAAATTTAATCCTAATCCAGAACACCTCGG 

1321 + + + + + + 1380 

TGTTTCTATATGATATGTTTTGAGATAGGTTTTTAAATTAGGATTAGGTCTTGTGGAGCC 

Hpyl78III 
Mspl | 
BsaWI | j 
BspEI j j 

Mnll BciVI | | j Hpyl78III 

BseRl| CjePI Mnll jjj TaqI BssSI | 

II I I III III 
AACTATCTTGTTCTCCTCAACATATATTCCGGATACATCGACTTCTCGTGATGACTTCAT 

1381 + + + + + + 1440 

TTGATAGAACAAGAGGAGTTGTATATAAGGCCTATGTAGCTGAAGAGCACTACTGAAGTA 
Fig. 20 (con't) 

TaqI 

Rsal Dpnl | 

BsrGI | Sau3AI 
Taal | Bcefl | | | 

Hpyl88IX TatI j Hpyl78III | j 

I II III 

TTCACATTTCAGAAACCACATTGGACTGTACAACGGCACACTCGCTCTTGAAGATCGAGC 

1441 + + + + + + 1500 

AAGTGTAAAGTCTTTGGTGTAACCTGACATGTTGCCGTGTGAGCGAGAACTTCTAGCTCG 

Tsp509I 
Dpnl 
Bell | 

Sau3AI j | BsmFI 
Apol | | j Acelll | Alul 

MboII Tsp509I | | j Plel Hinfl Taal j CviJI 

I I I I I I I II I 

AGAGTGGAAAGTCTATAAATTTGATCAATTTGGTGGGACTCTACGGTTAGGCAGTAGAGC 

1501 + + + + + + 1560 

TCTCACCTTTCAGATATTTAAACTAGTTAAACCACCCTGAGATGCCAATCCGTCATCTCG 

MboII Msel 
RleAI | Tsp509I | 

II II 
TGTGTTTTCTACAACAGACGAAGAACAAAGTAGCAGTAGTGTGGGTTCTGTAATTAACAT 

1561 + + + + + + 1620 

ACACAAAAGATGTTGTCTGCTTCTTGTTTCATCGTCATCACACCCAAGACATTAATTGTA 

BslI 

Msel PflMI 
Tsp509I | Mnll Alul | 

CviRl| | Ddel | CviJI j 

III II M 

CAATAATCTTGCAATTAACCTTCCCTCTATCTTAGGCAACAGAGTTGCTCCCAAGCTATG 

1621 + + + + + + 1680 

GTTATTAGAACGTTAATTGGAAGGGAGATAGAATCCGTTGTCTCAACGAGGGTTCGATAC 

Hinfl Sfcl 
Tfil RleAI | MboII 

I I I I 

GATTCGCCCCACAGGTTCATCAGCACCCTATAGCGAAGATAATAACCCTATAATCAATCT 

1681 + + + + + + 1740 

CTAAGCGGGGTGTCCAAGTAGTCGTGGGATATCGCTTCTATTATTGGGATATTAGTTAGA 
Fig. 20 (con't) 



Avail 

ECOO109I Dpnl 

Psp5II BstYI | 

Sau96I Sau3AI j BstAPI 

Sse8647I Bfal | | PstI | 

Hpyl78IIl| BseMII Fokl j j CviRI | | 

Ddel | | CviJI Bsrl Alwl | j j Sfcl | |MwoI 

III I I I I I I Ml I 

CTCAGGACCTTTGAGCCTACTGGATGACGAGAACCTAGATCCCTATGATACTGCAGACCT 

1741 + + + + + + 1800 

GAGTCCTGGAAACTCGGATGACCTACTGCTCTTGGATCTAGGGATACTATGACGTCTGGA 



Aatll 
Maelll 
Tsp45I 
BsaHI | 

Mnll Maell j | Alul Msel 
MboII Earl | Ddel || j CviJI Vspl 

I I I I II I I I 

TGCCCAACCTATCGCAGAAGTTCCTCTTCTGTATCTCTTAGACGTCACAGCTAAACATAT 

1801 + + + + + + 1860 

ACGGGTTGGATAGCGTCTTCAAGGAGAAGACATAGAGAATCTGCAGTGTCGATTTGTATA 



Mwol 
BsaJI 
Styl 

BseMII Bsu36I BsmFI | 

Tsp509I |MnlI Ddel SimI Bsbl Cvi JI | j 

I I I I I I M I I 

TAATACGGATAATTTCTACCCTGAGGGTCTAAATACAACTCAACACTACGGCTACCAAGG 

1861 + + + + + + 1920 

ATTATGCCTATTAAAGATGGGACTCCCAGATTTATGTTGAGTTGTGATGCCGATGGTTCC 



NlalV 
Avail | 
Sau96I j 
Bcefl | | 
I I 



TaqI 
Bsrl | 
Dpnl j 
Sau3AI | j 
BslI | | j Alwl 
I I II I 



Hpyl88IX 
MboII 

I 



Mnll 
Earl | 
Hpyl88IX| | 

II I 



CGTTTGGTCCCCTTACTGGATCGAAACAATCACAACTTCTGATACCTCTTCTGAAGATAC 

1921 + + + + + + 1980 

GCAAACCAGGGGAATGACCTAGCTTTGTTAGTGTTGAAGACTATGGAGAAGACTTCTATG 



BstXI 
Alul | 

MboII CviJI j Sfcl 

Taal | Eco57I Cac8I | j HphI | 

III III II 

TGTGAATACTTTACATCGCCAGCTTTATGGTGATTGGACACCTACAGGATATAAGGTAAA 

1981 + + + + + + 2040 

ACACTTATGAAATGTAGCGGTCGAAATACCACTAACCTGTGGATGTCCTATATTCCATTT 
Fig. 20 (con't) 

BsmAI BsrDI Mwol 

I I I 

CCCAGAAAACAAAGGAGACATTGCCCTATCTGCCTTCTGGCAATCTTTCCATAACTTATT 

2041 + + + + + + 2100 

GGGTCTTTTGTTTCCTCTGTAACGGGATAGACGGAAGACCGTTAGAAAGGTATTGAATAA 



Hpyl7 8III 
Cjel AlwNI 
Tthlllll| Alul 
CviJI | | CviJI | Plel 

Hael j j Mwol | j Alul | 

Maell Haelll || Sfcl | | j Cvi JI j 

I III I I I I II 

TGCGACACTACGTTATCAAACACAGCAAGGCCAAATAGCACCTACAGCTTCTGGAGAAGC 

2101 + + + + + + 2160 

ACGCTGTGATGCAATAGTTTGTGTCGTTCCGGTTTATCGTGGATGTCGAAGACCTCTTCG 



Cjel 
Hinfl | CviRI 
Taql | j Earl | 
MboII | j |BpmI | j 
III III 



Hinfl 
Tfil 
XmnI | 
I I 



Sthl32I Alul 
Ndel | CviJI 
I I I 



TACTCGACTCTTCGTGCATCAAAATAGCAACAATGATGCGAAAGGATTCCATATGGAAGC 

2161 + + + + + + 2220 

ATGAGCTGAGAAGCACGTAGTTTTATCGTTGTTACTACGCTTTCCTAAGGTATACCTTCG 



Tthlllll 
TspRI | 

CjePI Mnll | j Alul 

BSCGI | Btsl| j j CviJI 

II II I I I 

TACGGGTTATTCTTTGGGAACAACCTCAAACACTGCTTCTAATCATAGCTTTGGTGTAAA 

2221 + + + + + + 2280 

ATGCCCAATAAGAAACCCTTGTTGGAGTTTGTGACGAAGATTAGTATCGAAACCACATTT 



BsaJI 
BstDSI 
Tsp509I | 
Hpyl88IX | j 
CviJI | | j CviJI 

Pf 111081 Hin4I j j j BplI | 

I I I I I II 

CTTCTCCCAACTTTTCAGTAATCTCTACGAGAGCCACTCCGACAATTCCGTGGCTTCGCA 

2281 + + + + + + 2340 

GAAGAGGGTTGAAAAGTCATTAGAGATGCTCTCGGTGAGGCTGTTAAGGCACCGAAGCGT 
Fig. 20 (con't) 



HaelV 
Hin4I 
Bbvl 
Dpnl | 
Sau3AI | 
Hpyl78III | | 
Haell || || CviRI 
Hhal | || jj Fnu4HI | 

Eco47IIl|| jj jj CviJl| I Hin4I 

Mmel Taal | j j j j j j BsaJI | j j Hinf I 
Bpml| Sfcl| jjj jj jj Styl Tsel j j Tfil 

II II III II II I I II I I 

TACGACAACTGTAGCGCTCCAGATCAATAATCCTTGGCTGCAAGAGAGATTCTCTACATC 

2341 + + + + + + 2400 

ATGCTGTTGACATCGCGAGGTCTAGTTATTAGGAACCGACGTTCTCTCTAAGAGATGTAG 



Sfcl 
CviJI 
SfaNI 

Bfal | Sfcl Hpyl78III 

Mwol | Alul | SfaNI | 

CviRI Bpll| | j CviJI | Hpyl78III | j 

I II I I II III 

TGCATCTCTAGCCTACAGCTACAGCAACCACCATATCAAAGCATCTGGATATTCTGGAAA 

2401 + + + + + + 2460 

ACGTAGAGATCGGATGTCGATGTCGTTGGTGGTATAGTTTCGTAGACCTATAAGACCTTT 

CviJI 
Fnu4HI | 
Taul | 

Rsal Acil| j 

I II I 

AATACAAACGGAAGGCAAATGTTATAGTACGACATTAGGGGCGGCTCTCTCTTGCTCTCT 

2461 + + + + + + 2520 

TTATGTTTGCCTTCCGTTTACAATATCATGCTGTAATCCCCGCCGAGAGAGAACGAGAGA 



BsaXI 
Hpyl78III 

Dpnl | Muni 
Sau3AI | j Mnll Bcefl Tsp509I 

III I I I 

ATCTCTACAATGGCGATCACGACCTCTCCACTTCACTCCTTTTATCCAAGCAATTGCCGT 

2521 + + + + + + 2580 

TAGAGATGTTACCGCTAGTGCTGGAGAGGTGAAGTGAGGAAAATAGGTTCGTTAACGGCA 
Fig. 20 (con't) 



XmnI 
Apol 
Tsp509I 
Bfal I 
Alul | j 

Tthlllll Hpyl78III CviJl| j 

I I II I 

TCGTTCTAATCAAACTGCGTTTCAAGAAAGTGGAGATAAAGCTAGAAAATTTTCTGTTCA 

2581 + + + + + + 2640 

AGCAAGATTAGTTTGACGCAAAGTTCTTTCACCTCTATTTCGATCTTTTAAAAGACAAGT 

Hinf I 
Bsp24I | 

CjePI j 

Cjel| 
Haell | 



Hhal 
Eco47III 
Hpyl88IX 
Mnll 



Apol | Ml | Bbsl 

Hin4I EcoRI j j j j j MboII 

BsmFI Taal| Tsp509I j j | j j Tf il Alol | 

I II I I I I I I I M 

TAAACCCTTATATAACCTGACAGTCCCTCTGGGAATTCAGAGCGCTTGGGAATCCAAGTT 

2641 + + + + + + 2700 

ATTTGGGAATATATTGGACTGTCAGGGAGACCCTTAAGTCTCGCGAACCCTTAGGTTCAA 



Cjel Cac8I 
CjePI Alul | Mnll 

Bsp24l| CviJI j CviJI BseMII | 

II II I M 

CCGTCTTCCTACCTATTGGAACATAGAGCTTGCTTATCAGCCTGTCCTCTACCAACAAAA 

2701 + + + + + + 2760 

GGCAGAAGGATGGATAACCTTGTATCTCGAACGAATAGTCGGACAGGAGATGGTTGTTTT 
Fig. 20 (con't) 



Dpnl 
Sau3AI 
Bfal 



Hinf I 
Mae 1 1 
Dpnl 
Sau3AI | 
Ddel | I 
Hpyl78III j I 
I I I 



Plel 
Hinfl 
Tfil 
Hpyl78III 
Bfal 
Xbal | 



Drdll 
Nlaiv| 
Hpyl78III 
NlalV 
CviJI | 
NlaIIl| j 
Alwl | | j CjePI 
I III I 



TCCTGAGATCAACGTGAGTCTAGAATCTAGTGGATCGTCATGGCTCCTATCAGGAACCAC 

2761 + + + + + + 2820 

AGGACTCTAGTTGCACTCAGATCTTAGATCACCTAGCAGTACCGAGGATAGTCCTTGGTG 



BsrDI MboII 

BsrDl| Dral MboII | 

Mwol | j Msel | Apol | j 

Cac8I | | j CjePI | j Tsp509I j j 

I I II III III 

CCTTGCTCGCAATGCCATTGCTTTTAAAGGAAGAAACCAAATTTTTATCTTCCCTAAACT 

2821 + + + + + + 2880 

GGAACGAGCGTTACGGTAACGAAAATTTCCTTCTTTGGTTTAAAAATAGAAGGGATTTGA 

Mnll 

Ddel CviJI BciVI | 

I I II 

TTCGGTGTTCTTAGACTATCAAGGCTCGGTATCCTCATCAACGACGACACATTACCTTCA 

2881 + + + + + + 2940 

AAGCCACAAGAATCTGATAGTTCCGAGCCATAGGAGTAGTTGCTGCTGTGTAATGGAAGT 

Dral Nlalll 
Msel Msel | Nspl CviRI 

I II I I 

CGCAGGAACGACCTTTAAGTTTTAAAAGCATGTTATATAGACAATGCAACCTGTAAAGAC 

2941 + + + + + + 3000 

GCGTCCTTGCTGGAAATTCAAAATTTTCGTACAATATATCTGTTACGTTGGACATTTCTG 



CjePI 
I 



Hinfl 
Nlalll 
Tfil 
Hpyl78III | 
Real | j 
Bed | | | 
III 



CjePI 
BpulOI | 
Ddel j 
Alul| j 
CviJI | j 
III 



CAAATAGAGAGTAGTGAACACTCTCTACCATCATGAATCTTATGGGAGAAGCTAAGGGAA 

3001 + + + + + + 3060 

GTTTATCTCTCATCACTTGTGAGAGATGGTAGTACTTAGAATACCCTCTTCGATTCCCTT 
WO 00/24765 PCT/CA99/00992 

Fig. 20 (con't) 

NspV 
TaqI 

Fokl Hinfl | 

Msel Hin4I Mnll j 

Maell Tsp509I | Sthl32I Bfal Tfil j 

I II I III 

ATCCACAGATACGTTTCCCCCATAAAAATTAAGAACCCGATACATCCTCACTAGAGATTC 

3061 + + + + + + 3120 

TAGGTGTCTATGCAAAGGGGGTATTTTTAATTCTTGGGCTATGTAGGAGTGATCTCTAAG 

BsmI 
BpulOI | 
Msel Ddel j TaqI 



CTTTCTTGATGAATTTAGGATTCGTAAGCT 
Figure 21 A: CPN100626 Coding Sequence 

tcctgaactc cactcgaaat tactgattag ccaaggtacg tggacgacgc aggccactcc 60 
tgtgacctac aatgctttag ggatcaaagt gaaaaatacc atgcaggtgt ttcctaaagt 120 
cactctctcc ttagattact ctgcggatat ttcttcctcc acgctgagtc actacttaaa 180 
cgtggcgagt agaatgagat ttttaacaat aagtgaccaa aacagaaaga ttaaggaacc 240 
tctagtgtca aagactcctc ctaagttttt attctatctc gggaatttca cagcctgcat 300 
gttcgggatg actcctgcag tgtatagttt acaaacggac tcccttgaaa agtttgcttt 360 
agagagggat gaagagtttc gtacgagctt tcctctctta gactctctct ccactcttac 420 
aggattttct ccaataacta cgtttgttgg aaatagacat aattcctctc aagacattgt 480 
actttctaac tacaagtcta ttgataacat ccttcttctt tggacatcgg ctgggggagc 540 
tgtgtcctgt aataatttct tattatcaaa tgttgaagac catgccttct tcagtaaaaa 600 
tctcgcgatt gggactggag gcgcgattgc ttgccaggga gcctgcacaa tcacgaagaa 660 
tagaggaccc cttatttttt tcagcaatcg aggtcttaac aatgcgagta caggaggaga 720 
aactcgtggg ggtgcgattg cctgtaatgg agacttcacg atttctcaaa atcaagggac 780 
tttctacttt gtcaacaatt ccgtcaacaa ctggggagga gccctctcca ccaatggaca 840 
ctgccgcatc caaagcaaca gggcacctct actctttttt aacaatacag cccctagtgg 900 
agggggtgcg cttcgtagtg aaaatacaac gatctctgat aacacgcgtc ctatttattt 960 
taagaacaac tgtgggaaca atggcggggc cattcaaaca agcgttactg ttgcgataaa 1020 
aaataactcc gggtcggtga ttttcaataa caacacagcg ttatctggtt cgataaattc 1080 
aggaaatggt tcaggagggg cgatttatac aacaaaccta tccatagacg ataaccctgg 1140 
aactattctt ttcaataata actactgcat tcgcgatggc ggagctatct gtacacaatt 1200 
tttgacaatc aaaaatagtg gccacgtata tttcaccaac aatcaaggaa actggggagg 1260 
tgctcttatg ctcctacagg acagcacctg cctactcttc gcggaacaag gaaatatcgc 1320 
atttcaaaat aatgaggttt tcctcaccac atttggtaga tacaacgcca tacattgtac 1380 
accaaatagc aacttacaac ttggagctaa taaggggtat acgactgctt tttttgatcc 1440 
tatagaacac caacatccaa ctacaaatcc tctaatcttt aatcccaatg cgaaccatca 1500 
gggaacgatc ttattttctt cagcctatat cccagaagct tctgactacg aaaataattt 1560 
cattagcagc tcgaaaaata cctctgaact tcgcaatggt gtcctctcta tcgaggatcg 1620 
tgcgggatgg caattctata agttcactca aaaaggaggt atccttaaat tagggcatgc 1680 
ggcgagtatt gcaacaactg ccaactctga gactccatca actagtgtag gctcccaggt 1740 
catcattaat aaccttgcga ttaacctccc ctcgatctta gcaaaaggaa aagctcctac 1800 
cttgtggatc cgtcctctac aatctagtgc tcctttcaca gaggacaata accctacaat 1860 
tactttatca ggtcctctga cactcttaaa tgaggaaaac cgcgatccct acgacagtat 1920 
agatctctct gagcctttac aaaacattca tcttctttct ttatcggatg taacagcacg 1980 
tcatatcaat accgataact ttcatcctga aagcttaaat gcgactgagc attacggtta 2040 
tcaaggcatc tggtctcctt attgggtaga gacgataaca acaacaaata acgcttctat 2100 
agagacggca aacaccctct acagagctct gtatgccaat tggactccct taggatataa 2160 
ggtcaatcct gaataccaag gagatcttgc tacgactccc ctatggcaat cctttcatac 2220 
tatgttctct ctattaagaa gttataatcg aactggtgat tctgatatcg agaggccttt 2280 
cttagaaatt caagggattg ccgacggcct ctttgttcat caaaatagca tccccggggc 2340 
tccaggattc cgtatccaat ctacagggta ttccttacaa gcatcctccg aaacttcttt 2400 
acatcagaaa atctccttag gttttgcaca gttcttcacc cgcactaaag aaatcggatc 2460 
aagcaacaac gtctcggctc acaatacagt ctcttcactt tatgttgagc ttccgtggtt 2520 
ccaagaggcc tttgcaacat cccacagttt agcgtatggc tatggggacc atcacctcca 2580 
cgcctacatc cgtcacatca agaacagggc agaagggacg tgttatagcc atacattagc 2640 
agcagctatc ggctgttctt tcccttggca acagaaatcc tatcttcacc tcagcccgtt 2700 
cgttcaggca attgcaatac gttctcacca aacagcgttc gaagagattg gtgacaatcc 2760 
ccgaaagttt gtctctcaaa agcctttcta taatctgacc ttacctctag gaatccaagg 2820 
aaaatggcag tcaaaattcc acgtacctac agaatggact ctagaacttt cttaccaacc 2880 
ggtactctat caacaaaatc cccaaatcgg tgtcacgcta cttgcgagcg gaggttcctg 2940 
ggatatccta ggccataact atgttcgcaa tgctttaggg tacaaagtcc acaatcaaac 3000 
tgcgctcttc cgttctctcg atctattctt ggattaccaa ggatcggtct cctcctcgac 3060 
atctacgcac catctccaag caggaagtac cttaaaattc taaaataaaa gaacgataaa 3120 
attgaaatct ttagaattaa caactatccg atgagctacg ttagcccaat cggtagagga 3180 
ctccctcaaa atttaaataa 3200 
Figure 21 B: CPN100626 Deduced Amino Acid Sequence 

Met Gln Val Phe Pro Lys Val Thr Leu Ser Leu Asp Tyr Ser Ala Asp 
1 5 10 15 

Ile Ser Ser Ser Thr Leu Ser His Tyr Leu Asn Val Ala Ser Arg Met 
20 25 30 

Arg Phe Leu Thr Ile Ser Asp Gln Asn Arg Lys Ile Lys Glu Pro Leu 
35 40 45 

Val Ser Lys Thr Pro Pro Lys Phe Leu Phe Tyr Leu Gly Asn Phe Thr 
50 55 60 

Ala Cys Met Phe Gly Met Thr Pro Ala Val Tyr Ser Leu Gln Thr Asp 
65 70 75 80 

Ser Leu Glu Lys Phe Ala Leu Glu Arg Asp Glu Glu Phe Arg Thr Ser 
85 90 95 

Phe Pro Leu Leu Asp Ser Leu Ser Thr Leu Thr Gly Phe Ser Pro Ile 
100 105 110 

Thr Thr Phe Val Gly Asn Arg His Asn Ser Ser Gln Asp Ile Val Leu 
115 120 125 

Ser Asn Tyr Lys Ser Ile Asp Asn Ile Leu Leu Leu Trp Thr Ser Ala 
130 135 140 

Gly Gly Ala Val Ser Cys Asn Asn Phe Leu Leu Ser Asn Val Glu Asp 
145 150 155 160 

His Ala Phe Phe Ser Lys Asn Leu Ala Ile Gly Thr Gly Gly Ala Ile 
165 170 175 

Ala Cys Gln Gly Ala Cys Thr Ile Thr Lys Asn Arg Gly Pro Leu Ile 
180 185 190 

Phe Phe Ser Asn Arg Gly Leu Asn Asn Ala Ser Thr Gly Gly Glu Thr 
195 200 205 

Arg Gly Gly Ala Ile Ala Cys Asn Gly Asp Phe Thr Ile Ser Gln Asn 
210 215 220 

Gln Gly Thr Phe Tyr Phe Val Asn Asn Ser Val Asn Asn Trp Gly Gly 
225 230 235 240 

Ala Leu Ser Thr Asn Gly His Cys Arg Ile Gln Ser Asn Arg Ala Pro 
245 250 255 

Leu Leu Phe Phe Asn Asn Thr Ala Pro Ser Gly Gly Gly Ala Leu Arg 
260 265 270 

Ser Glu Asn Thr Thr Ile Ser Asp Asn Thr Arg Pro Ile Tyr Phe Lys 
275 280 285 
Fig. 21 B (con't) 

Asn Asn Cys Gly Asn Asn Gly Gly Ala Ile Gln Thr Ser Val Thr Val 
290 295 300 

Ala Ile Lys Asn Asn Ser Gly Ser Val Ile Phe Asn Asn Asn Thr Ala 
305 310 315 320 

Leu Ser Gly Ser Ile Asn Ser Gly Asn Gly Ser Gly Gly Ala Ile Tyr 
325 330 335 

Thr Thr Asn Leu Ser Ile Asp Asp Asn Pro Gly Thr Ile Leu Phe Asn 
340 345 350 

Asn Asn Tyr Cys Ile Arg Asp Gly Gly Ala Ile Cys Thr Gln Phe Leu 
355 360 365 

Thr Ile Lys Asn Ser Gly His Val Tyr Phe Thr Asn Asn Gln Gly Asn 
370 375 380 

Trp Gly Gly Ala Leu Met Leu Leu Gln Asp Ser Thr Cys Leu Leu Phe 
385 390 395 400 

Ala Glu Gln Gly Asn Ile Ala Phe Gln Asn Asn Glu Val Phe Leu Thr 
405 410 415 

Thr Phe Gly Arg Tyr Asn Ala Ile His Cys Thr Pro Asn Ser Asn Leu 
420 425 430 

Gln Leu Gly Ala Asn Lys Gly Tyr Thr Thr Ala Phe Phe Asp Pro Ile 
435 440 445 

Glu His Gln His Pro Thr Thr Asn Pro Leu Ile Phe Asn Pro Asn Ala 
450 455 460 

Asn His Gln Gly Thr Ile Leu Phe Ser Ser Ala Tyr Ile Pro Glu Ala 
465 470 475 480 

Ser Asp Tyr Glu Asn Asn Phe Ile Ser Ser Ser Lys Asn Thr Ser Glu 
485 490 495 

Leu Arg Asn Gly Val Leu Ser Ile Glu Asp Arg Ala Gly Trp Gln Phe 
500 505 510 

Tyr Lys Phe Thr Gln Lys Gly Gly Ile Leu Lys Leu Gly His Ala Ala 
515 520 525 

Ser Ile Ala Thr Thr Ala Asn Ser Glu Thr Pro Ser Thr Ser Val Gly 
530 535 540 

Ser Gln Val Ile Ile Asn Asn Leu Ala Ile Asn Leu Pro Ser Ile Leu 
545 550 555 560 

Ala Lys Gly Lys Ala Pro Thr Leu Trp Ile Arg Pro Leu Gln Ser Ser 
565 570 575 
Fig. 21 B (con't) 

Ala Pro Phe Thr Glu Asp Asn Asn Pro Thr Ile Thr Leu Ser Gly Pro 
580 585 590 

Leu Thr Leu Leu Asn Glu Glu Asn Arg Asp Pro Tyr Asp Ser Ile Asp 
595 600 605 

Leu Ser Glu Pro Leu Gln Asn Ile His Leu Leu Ser Leu Ser Asp Val 
610 615 620 

Thr Ala Arg His Ile Asn Thr Asp Asn Phe His Pro Glu Ser Leu Asn 
625 630 635 640 

Ala Thr Glu His Tyr Gly Tyr Gln Gly Ile Trp Ser Pro Tyr Trp Val 
645 650 655 

Glu Thr Ile Thr Thr Thr Asn Asn Ala Ser Ile Glu Thr Ala Asn Thr 
660 665 670 

Leu Tyr Arg Ala Leu Tyr Ala Asn Trp Thr Pro Leu Gly Tyr Lys Val 
675 680 685 

Asn Pro Glu Tyr Gln Gly Asp Leu Ala Thr Thr Pro Leu Trp Gln Ser 
690 695 700 

Phe His Thr Met Phe Ser Leu Leu Arg Ser Tyr Asn Arg Thr Gly Asp 
705 710 715 720 

Ser Asp Ile Glu Arg Pro Phe Leu Glu Ile Gln Gly Ile Ala Asp Gly 
725 730 735 

Leu Phe Val His Gln Asn Ser Ile Pro Gly Ala Pro Gly Phe Arg Ile 
740 745 750 

Gln Ser Thr Gly Tyr Ser Leu Gln Ala Ser Ser Glu Thr Ser Leu His 
755 760 765 

Gln Lys Ile Ser Leu Gly Phe Ala Gln Phe Phe Thr Arg Thr Lys Glu 
770 775 780 

Ile Gly Ser Ser Asn Asn Val Ser Ala His Asn Thr Val Ser Ser Leu 
785 790 795 800 

Tyr Val Glu Leu Pro Trp Phe Gln Glu Ala Phe Ala Thr Ser His Ser 
805 810 815 

Leu Ala Tyr Gly Tyr Gly Asp His His Leu His Ala Tyr Ile Arg His 
820 825 830 

Ile Lys Asn Arg Ala Glu Gly Thr Cys Tyr Ser His Thr Leu Ala Ala 
835 840 845 

Ala Ile Gly Cys Ser Phe Pro Trp Gln Gln Lys Ser Tyr Leu His Leu 
850 855 860 
Fig. 21 B (con't) 

Ser Pro Phe Val Gln Ala Ile Ala Ile Arg Ser His Gln Thr Ala Phe 
865 870 875 880 

Glu Glu Ile Gly Asp Asn Pro Arg Lys Phe Val Ser Gln Lys Pro Phe 
885 890 895 

Tyr Asn Leu Thr Leu Pro Leu Gly Ile Gln Gly Lys Trp Gln Ser Lys 
900 905 910 

Phe His Val Pro Thr Glu Trp Thr Leu Glu Leu Ser Tyr Gln Pro Val 
915 920 925 

Leu Tyr Gln Gln Asn Pro Gln Ile Gly Val Thr Leu Leu Ala Ser Gly 
930 935 940 

Gly Ser Trp Asp Ile Leu Gly His Asn Tyr Val Arg Asn Ala Leu Gly 
945 950 955 960 

Tyr Lys Val His Asn Gln Thr Ala Leu Phe Arg Ser Leu Asp Leu Phe 
965 970 975 

Leu Asp Tyr Gln Gly Ser Val Ser Ser Ser Thr Ser Thr His His Leu 
980 985 990 

Gln Ala Gly Ser Thr Leu Lys Phe 
995 1000 
Figure 22 (RY-45) 

Restriction enzyme analysis of CPN100626 



Tsp509I 
TaqI | Cjel 
I I 



BsaJI 
Styl 
Cvi JI | 



BsaAI 
Maell | 
Rsal | j 
III 



CviJI 
Hael 
Haelll 
Cac8I | 
I I 



MslI 
Cjel 
Hgal 



I I 



TCCTGAACTCCACTCGAAATTACTGATTAGCCAAGGTACGTGGACGACGCAGGCCACTCC 

1 + + + + + + 60 

AGGACTTGAGGTGAGCTTTAATGACTAATCGGTTCCATGCACCTGCTGCGTCCGGTGAGG 

Aarl 
MslI 

Maelll Dpnl BspMI CviRI | Maelll 

Tsp45I Sau3AI | Alwl | Nlalll j Tsp45I 

I I I I I II I 

TGTGACCTACAATGCTTTAGGGATCAAAGTGAAAAATACCATGCAGGTGTTTCCTAAAGT 

61 + + + + + + 120 



ACACTGGATGTTACGAAATCCCTAGTTTCACTTTTTATGGTACGTCCACAAAGGATTTCA 



BseMII 
Bsp24I | 
MboII CjePI j 
Acil| Cjel | j 
II II I 



Maelll 
Tsp45I 
Hinf I | 
Mnll | 
Ddel | j 
' II 



Maell 
Msel | 
Plel | j 
I I I 



CACTCTCTCCTTAGATTACTCTGCGGATATTTCTTCCTCCACGCTGAGTCACTACTTAAA 

121 + + + + + + 180 

GTGAGAGAGGAATCTAATGAGACGCCTATAAAGAAGGAGGTGCGACTCAGTGATGAATTT 



Cjel 

CjePI Maelll 
Bsp24l| Msel Tsp45I Msel NlalV 

II I I I I 

CGTGGCGAGTAGAATGAGATTTTTAACAATAAGTGACCAAAACAGAAAGATTAAGGAACC 

181 + + + + + + 240 

GCACCGCTCATCTTACTCTAAAAATTGTTATTCACTGGTTTTGTCTTTCTAATTCCTTGG 



Hinf I 

Mnll | Apol CviRI 

Plel | | , Tsp509I Cac8I | 

BseRl| | | Sthl32I Hpyl78III | Sthl32I j 

Bfal I DdeI Mnl1 I AvaI I I CviJI | | 

I II I I III III I I I 

TCTAGTGTCAAAGACTCCTCCTAAGTTTTTATTCTATCTCGGGAATTTCACAGCCTGCAT 

241 + + + + + + 300 

AGATCACAGTTTCTGAGGAGGATTCAAAAATAAGATAGAGCCCTTAAAGTGTCGGACGTA 
Fig. 22 (con't) 

BtsI 

Hpyl78III Fokl 
Plel| PstI 
Nlalll | j CviRI | 

Nspl | j Hinfl Sfcl | |TspRI Plel Hinfl Mnll 

I II I I I I I II I 

GTTCGGGATGACTCCTGCAGTGTATAGTTTACAAACGGACTCCCTTGAAAAGTTTGCTTT 

301 + + + + + + 360 

CAAGCCCTACTGAGGACGTCACATATCAAATGTTTGCCTGAGGGAACTTTTCAAACGAAA 



Earl 
I 



Hin4I 
Alul | 
CviJI j 
MboII 
Rsal| 
Fokl | | 
XmnI Sunl j j 
I I II 



BplI 
BsaXI | 
Hin4I 
Mnll 
Hinfl | 
Ddel | | 
Plel | || 
I I II 



AGAGAGGGATGAAGAGTTTCGTACGAGCTTTCCTCTCTTAGACTCTCTCTCCACTCTTAC 

361 + + + + + + 420 

TCTCTCCCTACTTCTCAAAGCATGCTCGAAAGGAGAGAATCTGAGAGAGAGGTGAGAATG 



Hpyl78III Rsal 
Cjel Tsp509I Smll | TatI | 

Mmel I Maell Bce83I |CjeI | |MnlI | j 

I I I I I I I I I I I 

AGGATTTTCTCCAATAACTACGTTTGTTGGAAATAGACATAATTCCTCTCAAGACATTGT 

421 + + + + + + 480 

TCCTAAAAGAGGTTATTGATGCAAACAACCTTTATCTGTATTAAGGAGAGTTCTGTAACA 



Fokl 
I 



Bsp24I 
Cjel 

MboII CjePI 

I 



Alul 

CviJI CviJI 
Acelll | Mwol | 

I I 



ACTTTCTAACTACAAGTCTATTGATAACATCCTTCTTCTTTGGACATCGGCTGGGGGAGC 

481 + + + + + + 540 

TGAAAGATTGATGTTCAGATAACTATTGTAGGAAGAAGAAACCTGTAGCCGACCCCCTCG 



Tsp509I 
Cjel | 
CjePI | | 
Bsp24l|| | 
III 



MboII 
Nlalll | 
Bbsl | j 
Eco57I MboII I j j 

I I I I I 



CjePI 

I 



TGTGTCCTGTAATAATTTCTTATTATCAAATGTTGAAGACCATGCCTTCTTCAGTAAAAA 

541 + + + + + + 600 

ACACAGGACATTATTAAAGAATAATAGTTTACAACTTCTGGTACGGAAGAAGTCATTTTT 
Fig. 22 (con't) 



Mnll 
Hpyl78III 
CviRI 
Cac8I 
CviJI | 
Nlaiv| j 
Bpml 
ScrFI 
BsaJI 
EcoRII | 
Cac8I 
CjePI 
Bsgl 

Hpyl78III BsrI BsmFI | 

Nrul TthlllIl|HhaI | j 

Thai Mnll | j Thai j j 

I I II I I I 

TCTCGCGATTGGGACTGGAGGCGCGATTGCTTGCCAGGGAGCCTGCACAATCACGAAGAA 

601 + + + + + + 660 

AGAGCGCTAACCCTGACCTCCGCGCTAACGAACGGTCCCTCGGACGTGTTAGTGCTTCTT 



MboII 
NlalV 
Avail 
ECOO109I 
Psp5II 
Sau96I 
SimI 
I 



Mnll TaqI 
I 



Rsal 
Mnll | 
TatI j 

I I 



TAGAGGACCCCTTATTTTTTTCAGCAATCGAGGTCTTAACAATGCGAGTACAGGAGGAGA 

661 + + + + + + 720 

ATCTCCTGGGGAATAAAAAAAGTCGTTAGCTCCAGAATTGTTACGCTCATGTCCTCCTCT 



BssSI BseRI BsmAI Hpyl78III 

I I I I 

AACTCGTGGGGGTGCGATTGCCTGTAATGGAGACTTCACGATTTCTCAAAATCAAGGGAC 

721 + + + + + + 780 

TTGAGCACCCCCACGCTAACGGACATTACCTCTGAAGTGCTAAAGAGTTTTAGTTCCCTG 



Banll 
Bspl286I 
BsrI | 
Tsp509I Bmrl I CviJI j Fokl 

Hindi | Mnll | |Hin4l| j Mnll 

BsmFI | | Hindi | j j NlalV j | BseRI | BplI 

III I I I I II I M I 

TTTCTACTTTGTCAACAATTCCGTCAACAACTGGGGAGGAGCCCTCTCCACCAATGGACA 

781 + + + + + + 840 

AAAGATGAAACAGTTGTTAAGGCAGTTGTTGACCCCTCCTCGGGAGAGGTGGTTACCTGT 
Fig. 22 (con't) 



TspRI Bspl286I BslI 
Acil| NlaIV| EcoNI | 
Fnu4HI | Bmgl | j Bf al | j 
Taulj SfaNI BseSljj Msel Mnll | | | 
BtsI || Mwol | Banl||| Mnll | CviJI || j | 
III II I I I I II I II I I 
CTGCCGCATCCAAAGCAACAGGGCACCTCTACTCTTTTTTAACAATACAGCCCCTAGTGG 
841 + + + + + + 900 



GACGGCGTAGGTTTCGTTGTCCCGTGGAGATGAGAAAAAATTGTTATGTCGGGGATCACC 



Hpyl88IX 

Hgal | Thai 

Pflll08I Dpnl | j AflHI | 

Hhal | Sau3AI | | j Mlul j CjePI Msel 

II I I II II II 

AGGGGGTGCGCTTCGTAGTGAAAATACAACGATCTCTGATAACACGCGTCCTATTTATTT 

901 + + + + + + 960 

TCCCCCACGCGAAGCATCACTTTTATGTTGCTAGAGACTATTGTGCGCAGGATAAATAAA 



CviJI 
Haelll 
CjePI I 
Faul Nlaiv| 
Sthl32l| Sau96l|| Tthlllll 
RleAI Taal | | Acil | j j Maelll Taal | 

I I II I III III 

TAAGAACAACTGTGGGAACAATGGCGGGGCCATTCAAACAAGCGTTACTGTTGCGATAAA 

961 + + + + + + 1020 

ATTCTTGTTGACACCCTTGTTACCGCCCCGGTAAGTTTGTTCGCAATGACAACGCTATTT 



SimI Apol 

Neil | Tsp509I 

ScrFI | Taql | 

Sthl32I Mspl| | HphI Bsbl Drdll | j 

I II I II III 

AAATAACTCCGGGTCGGTGATTTTCAATAACAACACAGCGTTATCTGGTTCGATAAATTC 

1021 + + + + + + 1080 

TTTATTGAGGCCCAGCCACTAAAAGTTATTGTTGTGTCGCAATAGACCAAGCTATTTAAG 



Hpyl78III 

Drdll ScrFI 

Mnll I BsaJI | 

Hpyl78III XmnlM EcoRII j 

I III M 

AGGAAATGGTTCAGGAGGGGCGATTTATACAACAAACCTATCCATAGACGATAACCCTGG 

1081 + + + + + + 1140 

TCCTTTACCAAGTCCTCCCCGCTAAATATGTTGTTTGGATAGGTATCTGCTATTGGGACC 
Fig. 22 (con't) 



Aim 

CviJI 

Ecil | 

Acil| j 

Bed | | j Tsp509I 

Hpyl78III Ml j Rsal | 

BsmI Nrul jjj j BsrGI | j 

CviRI Thai III | Tat I | j 

I I I II I Ml 

AACTATTCTTTTCAATAATAACTACTGCATTCGCGATGGCGGAGCTATCTGTACACAATT 

1141 + + + + + + 1200 

TTGATAAGAAAAGTTATTATTGATGACGTAAGCGCTACCGCCTCGATAGACATGTGTTAA 



BsaAI 
HphI 
Maell 
CviJI 
Hael 

Haelll | | BsrI 
MscI | j Bmrl | 

Eael | | j Mnll | j 

I I II III 
TTTGACAATCAAAAATAGTGGCCACGTATATTTCACCAACAATCAAGGAAACTGGGGAGG 

1201 + + + + + + 1260 

AAACTGTTAGTTTTTATCACCGGTGCATATAAAGTGGTTGTTAGTTCCTTTGACCCCTCC 



MboII Acil 
BsiHKAI Aarl | Earl 

Bspl286I Sfcl AlwNI j BspMI Thai 

I I II I I 

TGCTCTTATGCTCCTACAGGACAGCACCTGCCTACTCTTCGCGGAACAAGGAAATATCGC 

1261 + + + + + + 1320 

ACGAGAATACGAGGATGTCCTGTCGTGGACGGATGAGAAGCGCCTTGTTCCTTTATAGCG 

Rsal 

Cjel BsrGI | 

Cjel | Mnll Cjel | j 

Mnll HphI | HgiEIl| Cjel Cjel |TatI | 

I II III I I I I 

ATTTCAAAATAATGAGGTTTTCCTCACCACATTTGGTAGATACAACGCCATACATTGTAC 

1321 + + + + + + 1380 

TAAAGTTTTATTACTCCAAAAGGAGTGGTGTAAACCATCTATGTTGCGGTATGTAACATG 

Fokl 
Sfcl 
Dpnl | 

Alul BstZ17I Sau3AI | | 

Cjel CviJI Accl| Alwl | j j 

I I II IN I 

ACCAAATAGCAACTTACAACTTGGAGCTAATAAGGGGTATACGACTGCTTTTTTTGATCC 

1381 + + + + + + 1440 

TGGTTTATCGTTGAATGTTGAACCTCGATTATTCCCCATATGCTGACGAAAAAAACTAGG 
Fig. 22 (con't) 

Mmel 

Mnll | Bed 
Msel j Drdll | 

I I I I 

TATAGAACACCAACATCCAACTACAAATCCTCTAATCTTTAATCCCAATGCGAACCATCA 

1441 + + + + + + 1500 

ATATCTTGTGGTTGTAGGTTGATGTTTAGGAGATTAGAAATTAGGGTTACGCTTGGTAGT 

Pflll08I 

MboII Hpyl88IX | 

Dpnl | Alul | j 

Sau3AI | | CviJI j j 

Eco57I | j | CviJI Hindlll | | |Tsp509I 

I I II I I I I I I 

GGGAACGATCTTATTTTCTTCAGCCTATATCCCAGAAGCTTCTGACTACGAAAATAATTT 

1501 + + + + + + 1560 

CCCTTGCTAGAATAAAAGAAGTCGGATATAGGGTCTTCGAAGACTGATGCTTTTATTAAA 



Dpnl 

TaqI Faul | 

Alul | Sau3AI j 

CviJI | Hpyl88IX Sthl32I | j 

Fnu4HI | | Acelll | Bcgl Mnll | j j 

Tsel| j j Bbvl| j Mnll BsrDI | Mnll TaqI ||| | 

II I I III I II I I III I 

CATTAGCAGCTCGAAAAATACCTCTGAACTTCGCAATGGTGTCCTCTCTATCGAGGATCG 

1561 + + + + + + 1620 

GTAATCGTCGAGCTTTTTATGGAGACTTGAAGCGTTACCACAGGAGAGATAGCTCCTAGC 



Alwl Tsp509I Fokl 
Acil| Bed |BcgI | 

II I I I I 



BciVI 
Tsp509I | 
Msel | j 
I I I 



Fnu4HI 
Taul 
Acil | 
Nlalll | 
Nspl j 
SphI j 
Cac8I | | 
I II 



TGCGGGATGGCAATTCTATAAGTTCACTCAAAAAGGAGGTATCCTTAAATTAGGGCATGC 



ACGCCCTACCGTTAAGATATTCAAGTGAGTTTTTCCTCCATAGGAATTTAATCCCGTACG 



Bed 
Hin4I 
Hinfl 

BseMII Ddel | 

BstAPl| Hpyl8 8IX 
CviRI | j Plel | 

Mwol | Mwol j BsmAI | | 
II II III 



Bfal 
Spel| 
II 



ScrFI 
BsaJI | 
EcoRII | 
NlalV | j 
CviJI | j j 
II I I 



GGCGAGTATTGCAACAACTGCCAACTCTGAGACTCCATCAACTAGTGTAGGCTCCCAGGT 



CCGCTCATAACGTTGTTGACGGTTGAGACTCTGAGGTAGTTGATCACATCCGAGGGTCCA 
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Fig. 22 (con't) 



Msel 
Vspl 



Mnll 
Ddel 
Dpnl 



Mnll 
Sau3AI | 
Taql| | 

II I 



Alul 
CviJI 
I 



CATCATTAATAACCTTGCGATTAACCTCCCCTCGATCTTAGCAAAAGGAAAAGCTCCTAC 
GTAGTAATTATTGGAACGCTAATTGGAGGGGAGCTAGAATCGTTTTCCTTTTCGAGGATG 



Hin4I 
Dpnl 
NlalV 
BamHI 
BstYI 
Sau3AI 
Alwl | 
I 



Mnll 
BsiHKAI | 
Bspl286I j 
Bfal | j 

Mnll j j Tsp509I 

I I I I 

CTTGTGGATCCGTCCTCTACAATCTAGTGCTCCTTTCACAGAGGACAATAACCCTACAAT 

1801 + + + + + + 1860 

GAACACCTAGGCAGGAGATGTTAGATCACGAGGAAAGTGTCTCCTGTTATTGGGATGTTA 



Hpyl8 8IX 
Avail 
ECO0109I 
Psp5II 
Sau96I 
Sse8647I 
I 



Msel 
Mnll | 
Mnll | | 
I I I 



Pflll08I 
Dpnl | 
Sau3AI 
Thai | 
Acil | j 
Alwl | j j 
I I II 



BseMII 
Taal | 

I I 



TACTTTATCAGGTCCTCTGACACTCTTAAATGAGGAAAACCGCGATCCCTACGACAGTAT 
ATGAAATAGTCCAGGAGACTGTGAGAATTTACTCCTTTTGGCGCTAGGGATGCTGTCATA 



CviJI 
Ddel | 
Hpyl8 8IX | 
Dpnl 
Bglll | 
BstYI j 
Sau3AI | 
I 



MboII 
I 



Maelll Fokl 
Hpyl8 8IX | Maell| 

' ■ II 



1921 



AGATCTCTCTGAGGCTTTACAAAACATTCATCTTCTTTCTTTATCGGATGTAACAGCACG 
TCTAGAGAGACTCGGAAATGTTTTGTAAGTAGAAGAAAGAAATAGCCTACATTGTCGTGC 
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Fig. 22 (con't) 



BseMII 
Msel | 
Alul | | 
CviJI j | 
Hindlll | j j 

Fokl Hpyl78III III Ddel Taal 

I I I I II I I 

TCATATCAATACCGATAACTTTCATCCTGAAAGCTTAAATGCGACTGAGCATTACGGTTA 

1981 + + + + + + 2040 

AGTATAGTTATGGCTATTGAAAGTAGGACTTTCGAATTTACGCTGACTCGTAATGCCAAT 

BsmAI 

Bsal | Sfcl 

BsmAI | BsmAI | 

SfaNI |BsmBI BsmBlj 

III M 
TCAAGGCATCTGGTCTCCTTATTGGGTAGAGACGATAACAACAACAAATAACGCTTCTAT 

2041 + + + + + + 2100 

AGTTCCGTAGACCAGAGGAATAACCCATCTCTGCTATTGTTGTTGTTTATTGCGAAGATA 

Ban 1 1 
BsiHKAI 
Bspl286I 
CjePI 
Sad 
Alul | 
CviJI j 
Mnll j Muni 
Tthlllll| Plel 
Bcefl | j Tsp509I Bsu36I HaelV 

Sfcl | | j Mwol | Hinfl Ddel Hin4I 

I I II I I I I I 

AGAGACGGCAAACACCCTCTACAGAGCTCTGTATGCCAATTGGACTCCCTTAGGATATAA 

2101 + + + + + + 2160 

TCTCTGCCGTTTGTGGGAGATGTCTCGAGACATACGGTTAACCTGAGGGAATCCTATATT 



Dpnl 
Bglll | 
BstYI j 
Sau3AI j Hinfl 
Hpyl78III BsaJI | j Pf 111081 | 

CjePI | Styl | j Plel | j 

I I I I I I I I 

GGTCAATCCTGAATACCAAGGAGATCTTGCTACGACTCCCCTATGGCAATCCTTTCATAC 

2161 + + + + + + 2220 

CCAGTTAGGACTTATGGTTCCTCTAGAACGATGCTGAGGGGATACCGTTAGGAAAGTATG 
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Fig. 22 (con't) 



Msel 
I 



TaqI 
I 



EcoRV| 
Mnll 
Hpyl88IX I 
Hinfl 
Tfil 
BsrI | 
II 



CviJI 
Hael 
Haelll 
Hpyl78III 
TaqI 
HphI 



TATGTTCTCTCTATTAAGAAGTTATAATCGAACTGGTGATTCTGATATCGAGAGGCCTTT 



ATACAAGAGAGATAATTCTTCAATATTAGCTTGACCACTAAGACTATAGCTCTCCGGAAA 



NlalV 
Sthl32I 
CviJI 
SfaNI 
Neil 
ScrFI 
Smal 
BsaJI 
Mspl 
Neil 

Sthl32I ScrFI 

Apol Beef I | Aval| 

Tsp509I CviJI Mnll | | BsaJI | 

Ddel | Haelll Fokl | j Bpml j BsaJI | j | 

II I I I I II I I I I 

CTTAGAAATTCAAGGGATTGCCGACGGCCTCTTTGTTCATCAAAATAGCATCCCCGGGGC 

2281 + + + + + + 2340 

GAATCTTTAAGTTCCCTAACGGCTGCCGGAGAAACAAGTAGTTTTATCGTAGGGGCCCCG 



HaelV 
Hin4I 
Hinfl 
Tfil 
ScrFI 
Banll | 
Bspl286I I 
EcoRII | 
I 



Mnll 
Tthlllll | 

BciVI SfaNI | j 

Sfcl | Fokl Hpyl88IX | j j 

.... II I I I M 

TCCAGGATTCCGTATCCAATCTACAGGGTATTCCTTACAAGCATCCTCCGAAACTTCTTT 

2341 + + + + + + 2400 

AGGTCCTAAGGCATAGGTTAGATGTCCCATAAGGAATGTTCGTAGGAGGCTTTGAAGAAA 
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Fig. 22 (con't) 



Taal 
HphI | 

Bsu3 6I CviRI | j 

Hpyl88IX Ddel MboIl| | | 

I I II I I 

ACATCAGAAAATCTCCTTAGGTTTTGCACAGTTCTTCACCCGCACTAAAGAAATCGGATC 

2401 + + + + + + 2460 

TGTAGTCTTTTAGAGGAATCCAAAACGTGTCAAGAAGTGGGCGTGATTTCTTTAGCCTAG 



Faul Dpnl 
Sthl32l| Sau3AI | 
Acil | |Hpyl88IX| j 

II II I 



CviJI 
BsmAI | 
BsmBI | 
Tthlllll | 
Maell | | 

Alwl | j | 

II II 



CjePI 
Taal Earl | 

MboII | BsmAI | j 
II I II 

AAGCAACAACGTCTCGGCTCACAATACAGTCTCTTCACTTTATGTTGAGCTTCCGTGGTT 

2461 + + + + + + 2520 

TTCGTTGTTGCAGAGCCGAGTGTTATGTCAGAGAAGTGAAATACAACTCGAAGGCACCAA 



BsaJI 
BstDSI NlalV 
Alul |DrdIl| 
CviJI j Mnllj 

II II 



CviJI 
Hael 
Haelll 
StuI 
Fokl | 
I 



CviRI 
CjePI 



Taal 
I 



Fokl 
Bed 
NlalV 
Avail | 
CviJI Sau96l| 
RleAI | HphI | j 
II I II 



BsmFI 
I 



CCAAGAGGCCTTTGCAACATCCCACAGTTTAGCGTATGGCTATGGGGACCATCACCTCCA 

2521 + + + + + + 2580 

GGTTCTCCGGAAACGTTGTAGGGTGTCAAATCGCATACCGATACCCCTGGTAGTGGAGGT 



Madll 

Tsp45I AflHI BsmFI Fnu4HI 

Mnll |Hpyl78III Maell CviJI | Tsel | 

III I II II 

CGCCTACATCCGTCACATCAAGAACAGGGCAGAAGGGACGTGTTATAGCCATACATTAGC 

2581 + + + + + + 2640 

GCGGATGTAGGCAGTGTAGTTCTTGTCCCGTCTTCCCTGCACAATATCGGTATGTAATCG 



Alul 
CviJI 
Fnu4HI 
Tsel | 
II I 



Bbvl 

CviJI | BsaJI HphI 
Bbvl | j Styl MboII | 

... II I I II 

AGCAGCTATCGGCTGTTCTTTCCCTTGGCAACAGAAATCCTATCTTCACCTCAGCCCGTT 

2641 + + + + + + 2700 

TCGTCGATAGCCGACAAGAAAGGGAACCGTTGTCTTTAGGATAGAAGTGGAGTCGGGCAA 



Mnll 
BscGI 
CviJI 
BbvCI | 
BpulOI j 
Ddel | 
I 
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Fig. 22 (con't) 



Cjel 
MboII 
Maelll 
Tsp45I 
Tthlllll | 
NspV | | 

Taql | | 

Earl | | | 

II II 

CGTTCAGGCAATTGCAATACGTTCTCACCAAACAGCGTTCGAAGAGATTGGTGACAATCC 

2701 + + + + + + 2760 

GCAAGTCCGTTAACGTTATGCAAGAGTGGTTTGTCGCAAGCTTCTCTAACCACTGTTAGG 



Muni Cjel 
Tsp509I Mae I I 

BseMII | HphI | 

Sthl32I | jcviRI | j 

II I IN 



BsaJI 
Styl 
Mnll | 
Hinfl | j 

Sthl32I Tfil j | 

HphI I BsmAI CviJI Hpyl88IX Hin4I Bfal | | | 

I I I I I I I I M 

CCGAAAGTTTGTCTCTCAAAAGCCTTTCTATAATCTGACCTTACCTCTAGGAATCCAAGG 

2761 + + + + + + 2820 

GGCTTTCAAACAGAGAGTTTTCGGAAAGATATTAGACTGGAATGGAGATCCTTAGGTTCC 



Sfcl Hpyl78III 
Rsal | Bfal | 

Apol BsaAI | j Xbal | | 

Tsp509I Maell| | j Plel Hinfl ||| 
I II I I I I Ml 



Mspl 
BsaWI | 
BsrFI j 
PinAI | 
II 



AAAATGGCAGTCAAAATTCCACGTACCTACAGAATGGACTCTAGAACTTTCTTACCAACC 

2821 + + + + + + 2880 

TTTTACCGTCAGTTTTAAGGTGCATGGATGTCTTACCTGAGATCTTGAAAGAATGGTTGG 



ScrFI 
BsaJI | 
ECORII | 
NlaIV| 
Ppil 
Acil | 

Maelll BsrBI j 

Tsp45I Cac8I | j 

Rsal BslI | Mnll | j | 

I II I I I I 

GGTACTCTATCAACAAAATCCCCAAATCGGTGTCACGCTACTTGCGAGCGGAGGTTCCTG 



CCATGAGATAGTTGTTTTAGGGGTTTAGCCACAGTGCGATGAACGCTCGCCTCCAAGGAC 
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Fig. 22 (con't) 



CviJI 
Hael 
Haelll 
Bfal 
Avrll | 
BsaJI j 
Styl | 

EcoRV | | | MslI BsrDI Rsal MboII 

I II I I I I I 

GGATATCCTAGGCCATAACTATGTTCGCAATGCTTTAGGGTACAAAGTCCACAATCAAAC 

2941 + + + + + + 3000 

CCTATAGGATCCGGTATTGATACAAGCGTTACGAAATCCCATGTTTCAGGTGTTAGTTTG 



BseRI 
BslI 
Dpnl 

Dpnl Sau3AI 
Sau3AI | BseRI | 

Hpyl78IIl| | BsaJI || | Bsal 

Earl | j | Styl | j j BsmAI 

Hhal Sapl Taql| j Taqll | jj j Alwl | TaqI 

I I III I I II I I I I 

TGCGCTCTTCCGTTCTCTCGATCTATTCTTGGATTACCAAGGATCGGTCTCCTCCTCGAC 

3001 + + + + + + 3060 

ACGCGAGAAGGCAAGAGAGCTAGATAAGAACCTAATGGTTCCTAGCCAGAGGAGGAGCTG 



Apol 
Tsp509I 
Tthlllll | 

Mnll Msel| j 

Mnll | Bed Rsal | j | Tsp509I 

III I II I I 

ATCTACGCACCATCTCCAAGCAGGAAGTACCTTAAAATTCTAAAATAAAAGAACGATAAA 

3061 + + + + + + 3120 

TAGATGCGTGGTAGAGGTTCGTCCTTCATGGAATTTTAAGATTTTATTTTCTTGCTATTT 



CviJI 
Mwol | 
Mae I I | j 
Msel Alul | | j 

Tsp509I | Hpyl88IX CviJI j j 

II I III 

ATTGAAATCTTTAGAATTAACAACTATCCGATGAGCTACGTTAGCCCAATCGGTAGAGGA 

3121 + + + + + + 3180 

TAACTTTAGAAATCTTAATTGTTGATAGGCTACTCGATGCAATCGGGTTAGCCATCTCCT 



Plel 
Mnll | Hinfl 

I I 
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Fig. 22 (con't) 

Dral 
Mnll 
Swal 
Msel | 
Apol | | 
Tsp509I | | 
I II 
CTCCCTCAAAATTTAAATAA 

3181 -- + + 3200 

GAGGGAGTTTTAAATTTATT 
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Figure 23: 

tagacactat aaaacaaatt atagacaaaa aatctagcat tgatttattc agaatatttc 60 

tttctatttg tgaacgagta tgcgcttttt ttgcttcgga atg ttg ctt cct ttt 115 

Met Leu Leu Pro Phe 



act ttt gta ttg gct aat gaa ggt ctc caa ctt cct ttg gag acc tat 
Thr Phe Val Leu Ala Asn Glu Gly Leu Gln Leu Pro Leu Glu Thr Tyr 



att aca tta agt cct gaa tat caa gca gcc cct caa gta ggg ttt act 
Ile Thr Leu Ser Pro Glu Tyr Gln Ala Ala Pro Gln Val Gly Phe Thr 



cat aac caa aat caa gat ctc gca att gtc ggg aat cac aat gat ttc 
His Asn Gln Asn Gln Asp Leu Ala Ile Val Gly Asn His Asn Asp Phe 



atc ttg gac tat aag tac tat cgg tcg aat gga ggt gct ctt acc tgt 307 
Ile Leu Asp Tyr Lys Tyr Tyr Arg Ser Asn Gly Gly Ala Leu Thr Cys 



aag aat ctt ctg atc tct gaa aat ata ggg aat gtc ttc ttt gag aag 
Lys Asn Leu Leu Ile Ser Glu Asn Ile Gly Asn Val Phe Phe Glu Lys 



aat gtc tgt ccc aat tct ggc ggg gca att tat gct gct caa aat tgc 

Asn Val Cys Pro Asn Ser Gly Gly Ala Ile Tyr Ala Ala Gln Asn Cys 
90 95 100 

acg atc tcc aag aat cag aac tat gca ttt act aca aac ttg gtc tct 

Thr Ile Ser Lys Asn Gln Asn Tyr Ala Phe Thr Thr Asn Leu Val Ser 
105 110 115 

gac aat cct aca gcc act gcg gga tca cta ttg ggt gga gct ctc ttt 

Asp Asn Pro Thr Ala Thr Ala Gly Ser Leu Leu Gly Gly Ala Leu Phe 
120 125 130 

gcc ata aat tgc tct att act aat aac cta gga cag gga act ttc gtt 

Ala Ile Asn Cys Ser Ile Thr Asn Asn Leu Gly Gln Gly Thr Phe Val 
135 140 145 

gac aat ctc gct tta aat aag ggg ggt gcc ctc tat act gag acg aac 

Asp Asn Leu Ala Leu Asn Lys Gly Gly Ala Leu Tyr Thr Glu Thr Asn 

150 155 160 165 

tta tct att aaa gac aat aaa ggc ccg atc ata atc aag cag aat cgg 

Leu Ser Ile Lys Asp Asn Lys Gly Pro Ile Ile Ile Lys Gln Asn Arg 
170 175 180 

gca cta aat tcg gac agt tta gga gga ggg att tat agt ggg aac tct 

Ala Leu Asn Ser Asp Ser Leu Gly Gly Gly Ile Tyr Ser Gly Asn Ser 
185 190 195 
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Fig. 23 (con't) 

cta aat ata gag gga aat tct gga gct ata cag atc aca agc aac tct 

Leu Asn Ile Glu Gly Asn Ser Gly Ala Ile Gln Ile Thr Ser Asn Ser 
200 205 210 

tca gga tct ggg gga ggc ata ttt tct acc caa aca ctc acg atc tcc 
Ser Gly Ser Gly Gly Gly Ile Phe Ser Thr Gln Thr Leu Thr Ile Ser 
215 220 225 

tcg aat aaa aaa ctc ata gaa atc agt gaa aat tcc gcg ttc gca aat 
Ser Asn Lys Lys Leu Ile Glu Ile Ser Glu Asn Ser Ala Phe Ala Asn 
230 235 240 245 

aac tat gga tcg aac ttc aat cca gga gga gga ggt ctt act acc acc 
Asn Tyr Gly Ser Asn Phe Asn Pro Gly Gly Gly Gly Leu Thr Thr Thr 
250 255 260 

ttt tgc acg ata ttg aac aac cga gaa ggg gta ctc ttt aac aat aac 
Phe Cys Thr Ile Leu Asn Asn Arg Glu Gly Val Leu Phe Asn Asn Asn 
265 270 275 

caa agc cag agc aac ggt gga gcc att cat gcg aaa tct atc att atc 
Gln Ser Gln Ser Asn Gly Gly Ala Ile His Ala Lys Ser Ile Ile Ile 
280 285 290 

aaa gaa aat ggt cct gta tac ttt tta aat aac act gca act cgg gga 
Lys Glu Asn Gly Pro Val Tyr Phe Leu Asn Asn Thr Ala Thr Arg Gly 
295 300 305 

ggg gct ctc ctc aac tta tca gca ggt tct gga aac gga agc ttc atc 
Gly Ala Leu Leu Asn Leu Ser Ala Gly Ser Gly Asn Gly Ser Phe Ile 
310 315 320 325 

tta tct gca gat aat gga gat att atc ttt aac aat aat acg gcc tcc 
Leu Ser Ala Asp Asn Gly Asp Ile Ile Phe Asn Asn Asn Thr Ala Ser 
330 335 340 

aag cat gcc ctc aat cct cca tac aga aac gcc att cac tcg act cct 
Lys His Ala Leu Asn Pro Pro Tyr Arg Asn Ala Ile His Ser Thr Pro 
345 350 355 

aat atg aat ctg caa ata gga gcc cgt ccc ggc tat cga gtg ctg ttc 
Asn Met Asn Leu Gln Ile Gly Ala Arg Pro Gly Tyr Arg Val Leu Phe 
360 365 370 

tat gat ccc ata gaa cat gag ctc cct tcc tcc ttc ccc ata ctc ttt 
Tyr Asp Pro Ile Glu His Glu Leu Pro Ser Ser Phe Pro Ile Leu Phe 
375 380 385 

aat ttc gaa acc ggt cat aca ggt aca gtt tta ttt tca ggg gaa cat 
Asn Phe Glu Thr Gly His Thr Gly Thr Val Leu Phe Ser Gly Glu His 
390 395 400 405 
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Fig. 23 (con't) 

gta cac cag aac ttt acc gat gaa atg aat ttc ttt tcc tat tta agg 1363 

Val His Gln Asn Phe Thr Asp Glu Met Asn Phe Phe Ser Tyr Leu Arg 

410 415 420 

aac act tcg gaa cta cgt caa gga gtc ctt gct gtt gaa gat ggt gcg 1411 

Asn Thr Ser Glu Leu Arg Gln Gly Val Leu Ala Val Glu Asp Gly Ala 
425 430 435 

ggg ctg gcc tgc tat aag ttc ttc caa cga gga ggc act cta ctt cta 1459 

Gly Leu Ala Cys Tyr Lys Phe Phe Gln Arg Gly Gly Thr Leu Leu Leu 
440 445 450 

ggt caa ggt gcg gtg atc acg aca gca gga acg att ccc aca cca tcc 1507 

Gly Gln Gly Ala Val Ile Thr Thr Ala Gly Thr Ile Pro Thr Pro Ser 
455 460 465 

tca aca cca acg aca gta gga agt act ata act tta aat cac att gcc 1555 

Ser Thr Pro Thr Thr Val Gly Ser Thr Ile Thr Leu Asn His Ile Ala 
470 475 480 485 

att gac ctt cct tct att ctt tct ttt caa gct cag gct cca aaa att 1603 

Ile Asp Leu Pro Ser Ile Leu Ser Phe Gln Ala Gln Ala Pro Lys Ile 

490 495 500 

tgg att tac ccc aca aaa aca gga tct acc tat act gaa gat tcc aac 1651 

Trp Ile Tyr Pro Thr Lys Thr Gly Ser Thr Tyr Thr Glu Asp Ser Asn 
505 510 515 

ccg aca atc aca atc tca gga act ctc acc tta cgc aac agc aac aac 1699 

Pro Thr Ile Thr Ile Ser Gly Thr Leu Thr Leu Arg Asn Ser Asn Asn 
520 525 530 

gaa gat ccc tac gat agt ctg gat ctc tcg cac tct ctt gag aaa gtt 1747 

Glu Asp Pro Tyr Asp Ser Leu Asp Leu Ser His Ser Leu Glu Lys Val 
535 540 545 

ccc ctt ctt tat att gtc gat gtc gct gca caa aaa att aac tct tcg 1795 

Pro Leu Leu Tyr Ile Val Asp Val Ala Ala Gln Lys Ile Asn Ser Ser 
550 555 560 565 

caa ctg gat cta tcc aca tta aat tct ggc gaa cac tat ggg tat caa 1843 

Gln Leu Asp Leu Ser Thr Leu Asn Ser Gly Glu His Tyr Gly Tyr Gln 

570 575 580 

ggc atc tgg tcg acc tat tgg gta gaa act aca aca atc acg aac cct 1891 

Gly Ile Trp Ser Thr Tyr Trp Val Glu Thr Thr Thr Ile Thr Asn Pro 
585 590 595 

aca tct cta cta ggc gcg aat aca aaa cac aag ctg ctc tat gca aac 1939 

Thr Ser Leu Leu Gly Ala Asn Thr Lys His Lys Leu Leu Tyr Ala Asn 
600 605 610 
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Fig. 23 (con't) 

tgg tct cct cta ggc tac cgt cct cat ccc gaa cgt cga gga gaa ttc 1987 

Trp Ser Pro Leu Gly Tyr Arg Pro His Pro Glu Arg Arg Gly Glu Phe 
615 620 625 

att acg aat gcc ttg tgg caa tcg gca tat acg gct ctt gca gga ctc 2035 

Ile Thr Asn Ala Leu Trp Gln Ser Ala Tyr Thr Ala Leu Ala Gly Leu 

630 635 640 645 

cac tcc ctc tcc tcc tgg gat gaa gag aag ggt cat gca gct tcc cta 2083 

His Ser Leu Ser Ser Trp Asp Glu Glu Lys Gly His Ala Ala Ser Leu 
650 655 660 

caa ggc att ggt ctt ctg gtt cat caa aaa gac aaa aac ggt ttt aag 2131 

Gln Gly Ile Gly Leu Leu Val His Gln Lys Asp Lys Asn Gly Phe Lys 
665 670 675 

gga ttt cgt agt cat atg aca ggt tat agt gct acc acc gaa gca acc 2179 

Gly Phe Arg Ser His Met Thr Gly Tyr Ser Ala Thr Thr Glu Ala Thr 
680 685 690 

tct tct caa agt ccg aat ttc tct tta gga ttt gct cag ttc ttc tcc 2227 

Ser Ser Gln Ser Pro Asn Phe Ser Leu Gly Phe Ala Gln Phe Phe Ser 
695 700 705 

aaa gct aaa gaa cat gaa tct caa aat agc acg tcc tct cac cac tat 2275 

Lys Ala Lys Glu His Glu Ser Gln Asn Ser Thr Ser Ser His His Tyr 

710 715 720 725 

ttc tct gga atg tgc ata gca aaa tac tct ctt caa aga gtg ata cgt 2323 

Phe Ser Gly Met Cys Ile Ala Lys Tyr Ser Leu Gln Arg Val Ile Arg 
730 735 740 

cta tct gtg tct ctt gct tat atg ttt acc tcg gaa cat acc cat aca 2371 

Leu Ser Val Ser Leu Ala Tyr Met Phe Thr Ser Glu His Thr His Thr 
745 750 755 

atg tat cag ggt ctc ctg gaa ggg aac tct cag gga tct ttc cac aac 2419 

Met Tyr Gln Gly Leu Leu Glu Gly Asn Ser Gln Gly Ser Phe His Asn 
760 765 770 

cat acc tta gca ggg gct ctc tcc tgt gtt ttc tta cct caa cct cac 2467 
His Thr Leu Ala Gly Ala Leu Ser Cys Val Phe Leu Pro Gln Pro His 
775 780 785 

ggc gag tcc ctg cag atc tat ccc ttt att act gcc tta gcc atc cga 2515 
Gly Glu Ser Leu Gln Ile Tyr Pro Phe Ile Thr Ala Leu Ala Ile Arg 

790 795 800 805 

gga aat ctt gct gcg ttt caa gaa tct gga gac cat gct cgg gaa ttt 2563 
Gly Asn Leu Ala Ala Phe Gln Glu Ser Gly Asp His Ala Arg Glu Phe 
810 815 820 
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tcc cta cac cgc ccc cta acg gac gtc tcc ctc cct gta gga atc cgc 2611 

Ser Leu His Arg Pro Leu Thr Asp Val Ser Leu Pro Val Gly Ile Arg 

825 830 835 

gct tct tgg aag aac cac cac cga gtt ccc cta gtc tgg ctc aca gaa 2659 
Ala Ser Trp Lys Asn His His Arg Val Pro Leu Val Trp Leu Thr Glu 
840 845 850 

att tcc tat cgc tct act ctc tat agg caa gat cct gaa ctc cac tcg 2707 
Ile Ser Tyr Arg Ser Thr Leu Tyr Arg Gln Asp Pro Glu Leu His Ser 
855 860 865 

aaa tta ctg att agc caa ggt acg tgg acg acg cag gcc act cct gtg 2755 
Lys Leu Leu Ile Ser Gln Gly Thr Trp Thr Thr Gln Ala Thr Pro Val 
870 875 880 885 

acc tac aat gct tta ggg atc aaa gtg aaa aat acc atg cag gtg ttt 2803 
Thr Tyr Asn Ala Leu Gly Ile Lys Val Lys Asn Thr Met Gln Val Phe 
890 895 900 

cct aaa gtc act ctc tcc tta gat tac tct gcg gat att tct tcc tcc 2851 
Pro Lys Val Thr Leu Ser Leu Asp Tyr Ser Ala Asp Ile Ser Ser Ser 
905 910 915 

acg ctg agt cac tac tta aac gtg gcg agt aga atg aga ttt 2893 
Thr Leu Ser His Tyr Leu Asn Val Ala Ser Arg Met Arg Phe 
920 925 930 

taacaataag tgaccaaaac agaaagatta aggaacctct agtgtcaaag actcctccta 2953 

agtttttatt ctatctcggg aatttcacag cctgcatgtt cgggatg 3000 
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Figure 24 (RY-46) 

Restriction enzyme analysis of CPN1 00628 

HaelV 

Tsp509I Hin4I Bfal Hpyl88IX Sspl 

I II II 

TAGACACTATAAAACAAATTATAGACAAAAAATCTAGCATTGATTTATTCAGAATATTTC 

1 + + + + + + 60 

ATCTGTGATATTTTGTTTAATATCTGTTTTTTAGATCGTAACTAAATAAGTCTTATAAAG 

Hpyl8 8IX 
Hhal Mwol | 

I I I 

TTTCTATTTGTGAACGAGTATGCGCTTTTTTTGCTTCGGAATGTTGCTTCCTTTTACTTT 

61 + + + + + + 120 

AAAGATAAACACTTGCTCATACGCGAAAAAAACGAAGCCTTACAACGAAGGAAAATGAAA 

Bce83I 
Hpyl78III 
Msel | 

Bsal Bsal Cjel| | 

CviJI Cjel BsmAI BsmAI Mmel | j j 

I I I I I II I I 
TGTATTGGCTAATGAAGGTCTCCAACTTCCTTTGGAGACCTATATTACATTAAGTCCTGA 

121 + + + + + + 180 

ACATAACCGATTACTTCCAGAGGTTGAAGGAAACCTCTGGATATAATGTAATTCAGGACT 

Muni 
Tsp509I 

Mnll Sthl32I 
Tthlllll | Dpnl 
Bbvl Bglll | 

CviJI BslI BstYI j 

Fnu4HI | EcoNI | Sau3AI j 

Tsel| jsmll | j Hpyl78IIl| j 

II I I I II I II I I 
ATATCAAGCAGCCCCTCAAGTAGGGTTTACTCATAACCAAAATCAAGATCTCGCAATTGT 

181 + + + + + + 240 

TATAGTTCGTCGGGGAGTTCATCCCAAATGAGTATTGGTTTTAGTTCTAGAGCGTTAACA 

Rsal 

Hinfl Seal BsiEI 

Tfil TatI | Mnll BsiHKAI 

Hpyl78III | Taqll | j TaqI Bspl286I 

I I III I I 

CGGGAATCACAATGATTTCATCTTGGACTATAAGTACTATCGGTCGAATGGAGGTGCTCT 

241 + + + + + + 300 

GCCCTTAGTGTTACTAAAGTAGAACCTGATATTCATGATAGCCAGCTTACCTCCACGAGA 
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Hpyl8 8IX 
Dpnl 
Sau3AI | 
Hpyl8 8IX| | 
Hinfl | | | 

MboII Tfil j I 

II II 



Bbsl 
MboII 



XmnI BsmFI 

I I 

TACCTGTAAGAATCTTCTGATCTCTGAAAATATAGGGAATGTCTTCTTTGAGAAGAATGT 

301 + + + + + + 360 

ATGGACATTCTTAGAAGACTAGAGACTTTTATATCCCTTACAGAAGAAACTCTTCTTACA 



BslI 
Faul 
Sthl32I | 
Tsp509I j 
MboII | | 
I II 



Fnu4HI 
Tsel 
BstAPI 
Tsp509I | 
Bbvl | j 
Acil| j Mwol 

I I 



Dpnl 
Sau3AI | 
CviRI | | 
Tsp509I | j j 

' I I 



Hpyl88IX 
Hinfl | 
Tfil j 

I I 



CTGTCCCAATTCTGGCGGGGCAATTTATGCTGCTCAAAATTGCACGATCTCCAAGAATCA 

361 + + + + + + 420 

GACAGGGTTAAGACCGCCCCGTTAAATACGACGAGTTTTAACGTGCTAGAGGTTCTTAGT 



Dpnl 
Sau3AI 
Taqll 
TspRI | 
Acil 
BtsI | 
AlwNI | 
CjePI 
CviJI | 

Bsal Faul | j 

Nsil BsmAI Sthl32I | j j 

CviRI | Hpyl88IX| Sfcl Ml 

II III I I I I 

GAACTATGCATTTACTACAAACTTGGTCTCTGACAATCCTACAGCCACTGCGGGATCACT 

421 + + + + + + 480 

CTTGATACGTAAATGATGTTTGAACCAGAGACTGTTAGGATGTCGGTGACGCCCTAGTGA 



Banll 
BsiHKAI 
Bspl286I 
Sad 
Alul | CjePI 
CviJI j Mwol |Tsp509I 
I II I 



BslI 

ECONI | 

Bfal I j 

Avrll| j j 

BsaJlj j j 

Styl| | | 
II ' 



ATTGGGTGGAGCTCTCTTTGCCATAAATTGCTCTATTACTAATAACCTAGGACAGGGAAC 

481 + + + + + + 540 

TAACCCACCTCGAGAGAAACGGTATTTAACGAGATAATGATTATTGGATCCTGTCCCTTG 
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Dral 
Msel | 

II 



BsmAI 
BsmBI 
Bspl286I 
BseMII I 
Bmgl j 
BseSI | 
NlaIV| 
BanI | | 
III 



Mnll 
Ddel | 

I I 



TTTCGTTGACAATCTCGCTTTAAATAAGGGGGGTGCCCTCTATACTGAGACGAACTTATC 

541 + + + + + + 600 

AAAGCAACTGTTAGAGCGAAATTTATTCCCCCCACGGGAGATATGACTCTGCTTGAATAG 



Hinf I 
Bcgl 



BsaBI 
Sthl32I 
Dpnl | 
Sau3AI | | 
CviJI | | | 
Haelll | | | 
Sau96l| | j | 
II I 



Hpyl8 8IX 
Apol 



Tf il 

I 



Tsp509I 
Tthlllll 
Bspl286I | 
Bmgl | | 
BseSI j j 
II 



Mnll 
Taal 
I 



TATTAAAGACAATAAAGGCCCGATCATAATCAAGCAGAATCGGGCACTAAATTCGGACAG 

601 + + + + + + 660 

ATAATTTCTGTTATTTCCGGGCTAGTATTAGTTCGTCTTAGCCCGTGATTTAAGCCTGTC 



Alul 
CviJI 

Bcgl Hpyl78III | 

CjePI I CjePI Apol | | 

Mnll | | BseRI Mnll | Tsp509I j | 

III I II III 

TTTAGGAGGAGGGATTTATAGTGGGAACTCTCTAAATATAGAGGGAAATTCTGGAGCTAT 

661 + + + + + + 720 

AAATCCTCCTCCCTAAATATCACCCTTGAGAGATTTATATCTCCCTTTAAGACCTCGATA 



Bpml 
MboII | 
Dpnl | j 
Eco57I | | j 
Sau3AI j | j 
' I I ' 



AlwNI 
Mnll 
Dpnl | 
Tthlllll j 

BstYI 
Sau3AI 
Earl | 
Hpyl78III j 

II II 



Sau3AI 
Hpyl78III | 
BseRI | | 

II 



ACAGATCACAAGCAACTCTTCAGGATCTGGGGGAGGCATATTTTCTACCCAAACACTCAC 

721 + + + + + + 780 

TGTCTAGTGTTCGTTGAGAAGTCCTAGACCCCCTCCGTATAAAAGATGGGTTTGTGAGTG 
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TaqI Apol 
Tthlllll | Tsp509I Thai 

Dpnl | | Mnll TspRI Acil | 

I I I I III 

GATCTCCTCGAATAAAAAACTCATAGAAATCAGTGAAAATTCCGCGTTCGCAAATAACTA 

781 + + + + + + 840 

CTAGAGGAGCTTATTTTTTGAGTATCTTTAGTCACTTTTAAGGCGCAAGCGTTTATTGAT 



TaqI 
Dpnl | 



BslI 
Mnll 
ScrFI 
Mnll 
EcoRII | 
Mnll | | 



Sau3AI | | Alwl | | | 

III I I II 



BseRI 
BseRI I 



TGGATCGAACTTCAATCCAGGAGGAGGAGGTCTTACTACCACCTTTTGCACGATATTGAA 



ACCTAGCTTGAAGTTAGGTCCTCCTCCTCCAGAATGATGGTGGAAAACGTGCTATAACTT 

CviJI Taal CviJI 

BslI Rsal Msel CjePI | Xcml |NlaIV| Mwol 

III II I I II I 

CAACCGAGAAGGGGTACTCTTTAACAATAACCAAAGCCAGAGCAACGGTGGAGCCATTCA 

GTTGGCTCTTCCCCATGAGAAATTGTTATTGGTTTCGGTCTCGTTGCCACCTCGGTAAGT 



Aval 
Mnll 
TspRI 
CviRI | 

CjePI Avail BstZ17I Dral BtsI | j 

Nlalll I Sau96I AccI | Msel | Sthl32l| j | 

|| I II II II I I 

TGCGAAATCTATCATTATCAAAGAAAATGGTCCTGTATACTTTTTAAATAACACTGCAAC 



ACGCTTTAGATAGTAATAGTTTCTTTTACCAGGACATATGAAAAATTTATTGTGACGTTG 



Banll 
Bspl286I 
CviJI | 
Hin4l| | 
BseRI | | | 

I II 



BspMI 
I 



Hpyl78III 
BpH | 
Mnll | AlwNI | 
I I I 



Alul 
CviJI 
Hindlll | 
I I 



TCGGGGAGGGGCTCTCCTCAACTTATCAGCAGGTTCTGGAAACGGAAGCTTCATCTTATC 



AGCCCCTCCCCGAGAGGAGTTGAATAGTCGTCCAAGACCTTTGCCTTCGAAGTAGAATAG 
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PstI 
CviRI | 
I 



Hin4I 
I 



Msel 
I 



CviJI 
Haelll 



Beef I 
Nlalll | 
Nspl j 
SphI j 
Mnll | | 
Cac8I | j |TthlllII 
I I I I I 



TGCAGATAATGGAGATATTATCTTTAACAATAATACGGCCTCCAAGCATGCCCTCAATCC 



ACGTCTATTACCTCTATAATAGAAATTGTTATTATGCCGGAGGTTCGTACGGGAGTTAGG 



Mnll Mnll 
I I 



Hinfl 
Taql | 
Plel | | 
I I 



CviRI 
BsmFI | 
Hinfl | | 
Tfil j j 

I II 



Banll 
BscGI 
Bspl286I Mspl 
CviJI | Neil 
NlaIV| jscrFI 
II I I 



TCCATACAGAAACGCCATTCACTCGACTCCTAATATGAATCTGCAAATAGGAGCCCGTCC 



AGGTATGTCTTTGCGGTAAGTGAGCTGAGGATTATACTTAGACGTTTATCCTCGGGCAGG 



Taql 
Sthl32I | 
CviJI | | 
Sthl32I | j | 

I I II 



Dpnl 
Sau3AI | 
Alwl | | 
I I 



Banll 
BsiHKAI 
Bspl286I 
Sad 
Alul 
CviJI 
BsaXI 
Nlalll 
Alol | 
Ppil| 
II 



CGGCTATCGAGTGCTGTTCTATGATCCCATAGAACATGAGCTCCCTTCCTCCTTCCCCAT 

1201 + + + + + + 1260 

GCCGATAGCTCACGACAAGATACTAGGGTATCTTGTACTCGAGGGAAGGAGGAAGGGGTA 



Rsal 

Mspl Nlalll | 

BsaWl| Nspl I 

BsrFlj BsrGl| 

PinAI j TatI | 

Tsp509I NspV | j Taal Af 1III | | 

Msel | Taql || Rsal | BspLUllI || 

II I II II IN 

ACTCTTTAATTTCGAAACCGGTCATACAGGTACAGTTTTATTTTCAGGGGAACATGTACA 



TGAGAAATTAAAGCTTTGGCCAGTATGTCCATGTCAAAATAAAAGTCCCCTTGTACATGT 
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Apol 
Tsp509I 



BslI 
Msel | 



EcoNl|| Hpyl88IX Maell 

I III II 

CCAGAACTTTACCGATGAAATGAATTTCTTTTCCTATTTAAGGAACACTTCGGAACTACG 

1321 + + + + + + 1380 

GGTCTTGAAATGGCTACTTTACTTAAAGAAAAGGATAAATTCCTTGTGAAGCCTTGATGC 

MboII 
Cac8I 
CviJI | 
Hael | 
Haelll | 
Cac8I 
Mwol 



Bed 
Faul | 
Sthl32I | 



CviJI | 
MboII | | 



Mnll 
Mnll | 
I 



Hinfl Plel | | |AciI | | | | 

I I II I I I III 

TCAAGGAGTCCTTGCTGTTGAAGATGGTGCGGGGCTGGCCTGCTATAAGTTCTTCCAACG 

1381 + + + + + + 1440 

AGTTCCTCAGGAACGACAACTTCTACCACGCCCCGACCGGACGATATTCAAGAAGGTTGC 



Hpyl78III 
Dpnl | 

Mmel Bell | | Hinfl 

Bfal | Sau3AI j I Tfil 

BseRI | | Acil | | j HphI Fokl | 

III I I I I I II 

AGGAGGCACTCTACTTCTAGGTCAAGGTGCGGTGATCACGACAGCAGGAACGATTCCCAC 

1441 + + + + + + 1500 

TCCTCCGTGAGATGAAGATCCAGTTCCACGCCACTAGTGCTGTCGTCCTTGCTAAGGGTG 



Mnll 

Cjel RleAI | 
Bed | Bsbl | j 
II Ml 



Taal 
I 



Rsal 
Seal 
Tat I | 
I 



BsrDI 
CjePI 
Dral | 
Msel | j 
Cjel || | 
I II I 



ACCATCCTCAACACCAACGACAGTAGGAAGTACTATAACTTTAAATCACATTGCCATTGA 

1501 + + + + + + 1560 

TGGTAGGAGTTGTGGTTGCTGTCATCCTTCATGATATTGAAATTTAGTGTAACGGTAACT 



BstXI 
BseMII | 

BpulOI Apol j 

Ddel Tsp509I 
Alul| NlalV | 
CjePI CviJI j CviJI | j 

I II II I 

CCTTCCTTCTATTCTTTCTTTTCAAGCTCAGGCTCCAAAAATTTGGATTTACCCCACAAA 

GGAAGGAAGATAAGAAAGAAAAGTTCGAGTCCGAGGTTTTTAAACCTAAATGGGGTGTTT 
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Alwl 

RleAI I Mmel 

Dpnl j Eco57I HphI | 

BstYI | | Hinfl Sthl32l| Hpyl78III | 

Sau3AI I Tfil MboII | j Ddel | j BseMII 

I I I I I II I I I I 

AACAGGATCTACCTATACTGAAGATTCCAACCCGACAATCACAATCTCAGGAACTCTCAC 

1621 + + + + + + 1680 

TTGTCCTAGATGGATATGACTTCTAAGGTTGGGCTGTTAGTGTTAGAGTCCTTGAGAGTG 



Dpnl 
BstYI | 
Sau3AI 
Hpyl78III | 
MboII 
Pflll08I 



Dpnl 
BstYI | 
Sau3AI j 
Alwl | | 

I I I 



Hpyl78III 
Alwl Smll| 
I II 



CTTACGCAACAGCAACAACGAAGATCCCTACGATAGTCTGGATCTCTCGCACTCTCTTGA 



GAATGCGTTGTCGTTGTTGCTTCTAGGGATGCTATCAGACCTAGAGAGCGTGAGAGAACT 



Bbvl 
Bsgl| 
Bce83I | | 
Bcgl | j j TaqI 
I III I 



Bcgl 

CviRI Msel | 

Fnu4HI | MboII | j 
Tsel| |Tsp509I | | 
III III 



GAAAGTTCCCCTTCTTTATATTGTCGATGTCGCTGCACAAAAAATTAACTCTTCGCAACT 

1741 + + + + + + 1800 

CTTTCAAGGGGAAGAAATATAACAGCTACAGCGACGTGTTTTTTAATTGAGAAGCGTTGA 



BsrI 

Dpnl Apol 

BstYI | Tsp509I 

Sau3AI | Alwl Msel | 

II I I ' 



Hindi 
SfaNI 
AccI | 
Taql| 
Sail | | 
III 



1801 



GGATCTATCCACATTAAATTCTGGCGAACACTATGGGTATCAAGGCATCTGGTCGACCTA 
CCTAGATAGGTGTAATTTAAGACCGCTTGTGATACCCATAGTTCCGTAGACCAGCTGGAT 



Bbvl 
Hhal | 

Hpyl78III Bfal Thai j 

I I I I 

TTGGGTAGAAACTACAACAATCACGAACCCTACATCTCTACTAGGCGCGAATACAAAACA 

1861 + + + + + + 1920 

AACCCATCTTTGATGTTGTTAGTGCTTGGGATGTAGAGATGATCCGCGCTTATGTTTTGT 






WO 00/24765 



PCT/CA99/00992 



Fig. 24 (con't) 



Fnu4HI 
Alul | 

CviJI j 
Tsel j 

II 



BseRI 
CviRI | 
II 



Fokl 
Bfal 
Bsal | 
BsmAI | 
BsrI | j 



Taal 
Mnll | 
CviJI | | 



TaqI 
Sthl32I 
Maell 
Mnll | 
Mnll | j 
Hpyl78III | | I 
I III 



CAAGCTGCTCTATGCAAACTGGTCTCCTCTAGGCTACCGTCCTCATCCCGAACGTCGAGG 

1921 + + + + + + 1980 

GTTCGACGAGATACGTTTGACCAGAGGAGATCCGATGGCAGGAGTAGGGCTTGCAGCTCC 

BseRI 
Beef I | 
Hinf I 

Apol CjePI 
EcoRI CviJI CviRI 

Tsp509I BseRI BsmI Mwol | Plel | 

I I I I I I I 

AGAATTCATTACGAATGCCTTGTGGCAATCGGCATATACGGCTCTTGCAGGACTCCACTC 

1981 + + + + + + 2040 

TCTTAAGTAATGCTTACGGAACACCGTTAGCCGTATATGCCGAGAACGTCCTGAGGTGAG 



Earl 
Mnll | 
ScrFI I j 
BsaJI | | j CjePI 
BsaXI EcoRII | j j | Mnll | 

Mill I ' 



Alul 
CviJI 
Fnu4HI 
CviRI 
Nlalll 
Tsel 
MboII | 
Fokl | | 
SimI j j 
I I I 



Bbsl 
MboII 
Bbvl | 

I I 



CCTCTCCTCCTGGGATGAAGAGAAGGGTCATGCAGCTTCCCTACAAGGCATTGGTCTTCT 

2041 + + + + + + 2100 

GGAGAGGAGGACCCTACTTCTCTTCCCAGTACGTCGAAGGGATGTTCCGTAACCAGAAGA 



Msel 

Drdll Taal | PflllOSI Ndel 

I II II 

GGTTCATCAAAAAGACAAAAACGGTTTTAAGGGATTTCGTAGTCATATGACAGGTTATAG 

2101 + + + + + + 2160 

CCAAGTAGTTTTTCTGTTTTTGCCAAAATTCCCTAAAGCATCAGTATACTGTCCAATATC 

Apol 
Tsp509I 
Hpyl88IX | 

Mnll | j Ddel 
MboII Earl | | j MboII | 

I I I I I M 

TGCTACCACCGAAGCAACCTCTTCTCAAAGTCCGAATTTCTCTTTAGGATTTGCTCAGTT 

2161 + + + + + + 2220 

ACGATGGTGGCTTCGTTGGAGAAGAGTTTCAGGCTTAAAGAGAAATCCTAAACGAGTCAA 
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Alul Hinfl CjePI 

CviJI Nlalll HphI | Xcml 

BseMII I Tfil Maell j Mnll | 

I I I I I I I 

CTTCTCCAAAGCTAAAGAACATGAATCTCAAAATAGCACGTCCTCTCACCACTATTTCTC 

2221 + + + + + + 2280 

GAAGAGGTTTCGATTTCTTGTACTTAGAGTTTTATCGTGCAGGAGAGTGGTGATAAAGAG 

MboII 
CjePI | 

V178III CviRI | | Earl Mae I I BsmAI 

I I I I I I I 

TGGAATGTGCATAGCAAAATACTCTCTTCAAAGAGTGATACGTCTATCTGTGTCTCTTGC 

2281 + + + + + + 2340 

ACCTTACACGTATCGTTTTATGAGAGAAGTTTCTCACTATGCAGATAGACACAGAGAACG 



Cjel 
Bsal 
BsmAI 

Cjel ScrFI 
Hpyl8 8IX| EcoRII | 

BsaJI I Mnll SimI | j | Ddel 

I II I I I I I I 

TTATATGTTTACCTCGGAACATACCCATACAATGTATCAGGGTCTCCTGGAAGGGAACTC 



AATATACAAATGGAGCCTTGTATGGGTATGTTACATAGTCCCAGAGGACCTTCCCTTGAG 
Banll 

Dpnl Bspl286I 
BstYI | BseMII BpulOI CviJI | 

Sau3AI j Alwl| Ddel BslI | j BsmFI 

I I II I I I I I 

TCAGGGATCTTTCCACAACCATACCTTAGCAGGGGCTCTCTCCTGTGTTTTCTTACCTCA 

2401 + + + + + + 2460 

AGTCCCTAGAAAGGTGTTGGTATGGAATCGTCCCCGAGAGAGGACACAAAAGAATGGAGT 



Mnll 
Hinfl | 



Dpnl 
Beef I ' 
Bglll | 
BstYI | 
PstI | 
Sau3AI j 
CviRI | | 
Plel j j 
I Sfcl | || 
II I I II 



Bbvl 
BsaJI 
Hpyl8 8IX 
Bed 
CviJI 
Mnll 
BpulOI | 
Ddel | 
I 



ACCTCACGGCGAGTCCCTGCAGATCTATCCCTTTATTACTGCCTTAGCCATCCGAGGAAA 



TGGAGTGCCGCTCAGGGACGTCTAGATAGGGAAATAATGACGGAATCGGTAGGCTCCTTT 
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Sthl32I 
Hpyl78III 

Bsal 
BsmAI 
Hinf I | 
Tf il j 



Tsel |Hpyl78III | | 
II I II 



Apol 
Tsp509I 
Hpyl78III | 
BslI | j 
Aval | | 
Nlalll | | | 
III I 



TCTTGCTGCGTTTCAAGAATCTGGAGACCATGCTCGGGAATTTTCCCTACACCGCCCCCT 
AGAACGACGCAAAGTTCTTAGACCTCTGGTACGAGCCCTTAAAAGGGATGTGGCGGGGGA 



BsmAI 
BsmBI 
Aatll 
BsaHI | 
Maell | 
BslI | | 
I I 



Hinf I Hhal 
Tfil Thai | 
Sfcl Mnll | Acil | | 

I II 



Drdll MboII 



AACGGACGTCTCCCTCCCTGTAGGAATCCGCGCTTCTTGGAAGAACCACCACCGAGTTCC 
TTGCCTGCAGAGGGAGGGACATCCTTAGGCGCGAAGAACCTTCTTGGTGGTGGCTCAAGG 



CviJI 

BslI | Apol 

Bfal | I Tsp509I 

III I 



Hpyl78III 
Dpnl | 
BstYI | j 
Sau3AI j j 
Alwl | j | 

Sfcl | Ml 
I I I 



CCTAGTCTGGCTCACAGAAATTTCCTATCGCTCTACTCTCTATAGGCAAGATCCTGAACT 

2641 + + + + + + 2700 

GGATCAGACCGAGTGTCTTTAAAGGATAGCGAGATGAGAGATATCCGTTCTAGGACTTGA 



Tsp509I 
TaqI | Cjel 
I I 



BsaJI 
Styl 
CviJI | 
II 



BsaAI 
Maell | 
Rsal | | 

I I 



Hgal | 
CviJI 
Hael 
Haelll 
Cac8I | 
I I 



Maelll 
Tsp45I 
MslI 
Cjel 



CCACTCGAAATTACTGATTAGCCAAGGTACGTGGACGACGCAGGCCACTCCTGTGACCTA 
GGTGAGCTTTAATGACTAATCGGTTCCATGCACCTGCTGCGTCCGGTGAGGACACTGGAT 
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Aarl 
MslI | 

Dpnl BspMI CviRI | j Maelll 

Sau3AI | Alwl | Nlalllj | Tsp45I 

I I I I III I 

CAATGCTTTAGGGATCAAAGTGAAAAATACCATGCAGGTGTTTCCTAAAGTCACTCTCTC 

2761 + + + + + + 2820 

GTTACGAAATCCCTAGTTTCACTTTTTATGGTACGTCCACAAAGGATTTCAGTGAGAGAG 

Cjel 

Maelll CjePI 
BseMII Tsp45I Bsp24l| 

Bsp24I | Hinf I | Maell 

MboII CjePI j Mnllj Msel | 

Ddel Acil| Cjel | j Ddel || Plel | | 

I II II I III Ml 

CTTAGATTACTCTGCGGATATTTCTTCCTCCACGCTGAGTCACTACTTAAACGTGGCGAG 

2821 + + + + + + 2880 

GAATCTAATGAGACGCCTATAAAGAAGGAGGTGCGACTCAGTGATGAATTTGCACCGCTC 

Mnll 
Plel | 

Maelll BseRl| j 

Msel Tsp45I Msel NlalV Bfal | | | 

I I I I I II I 

TAGAATGAGATTTTAACAATAAGTGACCAAAACAGAAAGATTAAGGAACCTCTAGTGTCA 

2881 + + + + + + 2940 

ATCTTACTCTAAAATTGTTATTCACTGGTTTTGTCTTTCTAATTCCTTGGAGATCACAGT 

Hpyl78III 
Nlalll | 
Apol CviRI | 

Tsp509I Cac8I | | 

Sthl32I Hpyl78III | Sthl32I j j 

Hinf I Ddel Mnll | Aval | j CviJI | |NspI 

I I II III IN I I 

AAGACTCCTCCTAAGTTTTTATTCTATCTCGGGAATTTCACAGCCTGCATGTTCGGGATG 

2941 + + + + + + 3000 

TTCTGAGGAGGATTCAAAAATAAGATAGAGCCCTTAAAGTGTCGGACGTACAAGCCCTAC 
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Figure 25: 

cactgtggat gtgatattcg cagaacctcc cgtcaaatat actctagata taggaagcaa 60 

attacgattt taaaccttat ttaacgacag ggttgaggc atg cct ctt tct ttc 114 

Met Pro Leu Ser Phe 
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65 
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Ser 
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Ser 
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Ser 
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Ser 
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95 
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145 
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150 


act acg tcg gca act ccc gca atc act aca gta act aca gga
Thr Thr Ser Ala Thr Pro Ala Ile Thr Thr Val Thr Thr Gly
155 160
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165 


tct 
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Ser 


gct ctc caa cct aca gac tca ctc act
Ala Leu Gln Pro Thr Asp Ser Leu Thr
Ala Leu Gln Pro Thr Asp Ser Leu Thr
170 175
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Val 
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Glu 
Glu 


aac ata tcc caa
Asn Ile Ser Gln
Asn Ile Ser Gln
180

tcg atc aag ttt ttt ggg aac ctt gcc aac ttc ggc tct gca att agc
Ser Ile Lys Phe Phe Gly Asn Leu Ala Asn Phe Gly Ser Ala Ile Ser
Ser Ile Lys Phe Phe Gly Asn Leu Ala Asn Phe Gly Ser Ala Ile Ser
185 190 195
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Fig. 25 (con't) 

agt tct ccc acg gca gtc gtt aaa ttc atc aat aac acc gct acc atg 738
Ser Ser Pro Thr Ala Val Val Lys Phe Ile Asn Asn Thr Ala Thr Met
Ser Ser Pro Thr Ala Val Val Lys Phe Ile Asn Asn Thr Ala Thr Met
200 205 210

agc ttc tcc cat aac ttt act tcg tca gga ggc ggc gtg att tat gga 786
Ser Phe Ser His Asn Phe Thr Ser Ser Gly Gly Gly Val Ile Tyr Gly
Ser Phe Ser His Asn Phe Thr Ser Ser Gly Gly Gly Val Ile Tyr Gly
215 220 225

gga agc tct ctc ctt ttt gaa aac aat tct gga tgc atc atc ttc acc 834
Gly Ser Ser Leu Leu Phe Glu Asn Asn Ser Gly Cys Ile Ile Phe Thr
Gly Ser Ser Leu Leu Phe Glu Asn Asn Ser Gly Cys Ile Ile Phe Thr
230 235 240 245

gcc aac tcc tgt gtg aac agc tta aaa ggc gtc acc cct tca tca gga 882
Ala Asn Ser Cys Val Asn Ser Leu Lys Gly Val Thr Pro Ser Ser Gly
Ala Asn Ser Cys Val Asn Ser Leu Lys Gly Val Thr Pro Ser Ser Gly
250 255 260

acc tat gct tta gga agt ggc gga gcc atc tgc atc cct acg gga act 930
Thr Tyr Ala Leu Gly Ser Gly Gly Ala Ile Cys Ile Pro Thr Gly Thr
Thr Tyr Ala Leu Gly Ser Gly Gly Ala Ile Cys Ile Pro Thr Gly Thr
265 270 275

ttc gaa tta aaa aac aat cag ggg aag tgc acc ttc tct tat aat ggt 978
Phe Glu Leu Lys Asn Asn Gln Gly Lys Cys Thr Phe Ser Tyr Asn Gly
Phe Glu Leu Lys Asn Asn Gln Gly Lys Cys Thr Phe Ser Tyr Asn Gly
280 285 290

aca cca aat gat gcg ggt gcg atc tac gcc gaa acc tgc aac atc gta 1026
Thr Pro Asn Asp Ala Gly Ala Ile Tyr Ala Glu Thr Cys Asn Ile Val
Thr Pro Asn Asp Ala Gly Ala Ile Tyr Ala Glu Thr Cys Asn Ile Val
295 300 305

ggg aac cag ggt gcc ttg ctc cta gat agc aac act gca gcg aga aat 1074
Gly Asn Gln Gly Ala Leu Leu Leu Asp Ser Asn Thr Ala Ala Arg Asn
Gly Asn Gln Gly Ala Leu Leu Leu Asp Ser Asn Thr Ala Ala Arg Asn
310 315 320 325

ggc gga gcc atc tgt gct aaa gtg ctc aat att caa gga cgc ggt cct 1122
Gly Gly Ala Ile Cys Ala Lys Val Leu Asn Ile Gln Gly Arg Gly Pro
Gly Gly Ala Ile Cys Ala Lys Val Leu Asn Ile Gln Gly Arg Gly Pro
330 335 340

att gaa ttc tct aga aac cgc gcg gag aag ggt gga gct att ttc ata 1170
Ile Glu Phe Ser Arg Asn Arg Ala Glu Lys Gly Gly Ala Ile Phe Ile
Ile Glu Phe Ser Arg Asn Arg Ala Glu Lys Gly Gly Ala Ile Phe Ile
345 350 355

ggc ccc tct gtt gga gac cct gcg aag caa aca tcg aca ctt acg att 1218
Gly Pro Ser Val Gly Asp Pro Ala Lys Gln Thr Ser Thr Leu Thr Ile
Gly Pro Ser Val Gly Asp Pro Ala Lys Gln Thr Ser Thr Leu Thr Ile
360 365 370

ttg gct tcc gaa ggt gat att gcg ttc caa gga aac atg ctc aat aca 1266
Leu Ala Ser Glu Gly Asp Ile Ala Phe Gln Gly Asn Met Leu Asn Thr
Leu Ala Ser Glu Gly Asp Ile Ala Phe Gln Gly Asn Met Leu Asn Thr
375 380 385
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Fig. 25 (con't) 

aaa cct gga atc cgc aat gcc atc act gta gaa gca ggg gga gag att
Lys Pro Gly Ile Arg Asn Ala Ile Thr Val Glu Ala Gly Gly Glu Ile
Lys Pro Gly Ile Arg Asn Ala Ile Thr Val Glu Ala Gly Gly Glu Ile
390 395 400 405

gtg tct cta tct gca caa gga ggc tca cgt ctt gta ttt tat gat ccc
Val Ser Leu Ser Ala Gln Gly Gly Ser Arg Leu Val Phe Tyr Asp Pro
Val Ser Leu Ser Ala Gln Gly Gly Ser Arg Leu Val Phe Tyr Asp Pro
410 415 420

att aca cat agc ctc cca acc aca agt ccg tct aat aaa gac att aca
Ile Thr His Ser Leu Pro Thr Thr Ser Pro Ser Asn Lys Asp Ile Thr
Ile Thr His Ser Leu Pro Thr Thr Ser Pro Ser Asn Lys Asp Ile Thr
425 430 435

atc aac gct aat ggc gct tca gga tct gta gtc ttt aca agt aag gga
Ile Asn Ala Asn Gly Ala Ser Gly Ser Val Val Phe Thr Ser Lys Gly
Ile Asn Ala Asn Gly Ala Ser Gly Ser Val Val Phe Thr Ser Lys Gly
440 445 450

ctc tcc tct aca gaa ctc ctg ttg cct gcc aac acg aca act ata ctt
Leu Ser Ser Thr Glu Leu Leu Leu Pro Ala Asn Thr Thr Thr Ile Leu
Leu Ser Ser Thr Glu Leu Leu Leu Pro Ala Asn Thr Thr Thr Ile Leu
455 460 465

cta gga aca gtc aag atc gct agt gga gaa ctg aag att act gac aat
Leu Gly Thr Val Lys Ile Ala Ser Gly Glu Leu Lys Ile Thr Asp Asn
Leu Gly Thr Val Lys Ile Ala Ser Gly Glu Leu Lys Ile Thr Asp Asn
470 475 480 485

gcg gtt gtc aat gtt gct ggc ttc gct act cag ggc tca ggt cag ctt
Ala Val Val Asn Val Ala Gly Phe Ala Thr Gln Gly Ser Gly Gln Leu
Ala Val Val Asn Val Ala Gly Phe Ala Thr Gln Gly Ser Gly Gln Leu
490 495 500

acc ctg ggc tct gga gga acc tta ggg ctg gca aca ccc acg gga gca
Thr Leu Gly Ser Gly Gly Thr Leu Gly Leu Ala Thr Pro Thr Gly Ala
Thr Leu Gly Ser Gly Gly Thr Leu Gly Leu Ala Thr Pro Thr Gly Ala
505 510 515

cct gcc gct gta gac ttt acg att gga aag tta gca ttc gat cct ttt
Pro Ala Ala Val Asp Phe Thr Ile Gly Lys Leu Ala Phe Asp Pro Phe
Pro Ala Ala Val Asp Phe Thr Ile Gly Lys Leu Ala Phe Asp Pro Phe
520 525 530

tcc ttc cta aaa aga gat ttt gtt tca gca tca gta aat gca ggc aca
Ser Phe Leu Lys Arg Asp Phe Val Ser Ala Ser Val Asn Ala Gly Thr
Ser Phe Leu Lys Arg Asp Phe Val Ser Ala Ser Val Asn Ala Gly Thr
535 540 545

aaa aac gtc act tta aca gga gct ctg gtt ctt gat gaa cat gac gtt
Lys Asn Val Thr Leu Thr Gly Ala Leu Val Leu Asp Glu His Asp Val
Lys Asn Val Thr Leu Thr Gly Ala Leu Val Leu Asp Glu His Asp Val
550 555 560 565

aca gat ctt tat gat atg gtg tca tta caa tct cca gta gca att cct
Thr Asp Leu Tyr Asp Met Val Ser Leu Gln Ser Pro Val Ala Ile Pro
Thr Asp Leu Tyr Asp Met Val Ser Leu Gln Ser Pro Val Ala Ile Pro
570 575 580






Fig. 25 (con't) 

atc gct gtt ttc aaa gga gca acc gtt
Ile Ala Val Phe Lys Gly Ala Thr Val
Ile Ala Val Phe Lys Gly Ala Thr Val
585 590

ggg gag att gcg act cca agc cac tac
Gly Glu Ile Ala Thr Pro Ser His Tyr
Gly Glu Ile Ala Thr Pro Ser His Tyr
600 605

tac aca tgg tcc cgt ccc ctg tta att
Tyr Thr Trp Ser Arg Pro Leu Leu Ile
Tyr Thr Trp Ser Arg Pro Leu Leu Ile
615 620

cct gga ggt ccc tct cct agc gca aat
Pro Gly Gly Pro Ser Pro Ser Ala Asn
Pro Gly Gly Pro Ser Pro Ser Ala Asn
630 635

tca gac act ctc gtg cgt tct acc tat
Ser Asp Thr Leu Val Arg Ser Thr Tyr
Ser Asp Thr Leu Val Arg Ser Thr Tyr
650

gga gaa att gtc agc aac agc tta tgg
Gly Glu Ile Val Ser Asn Ser Leu Trp
Gly Glu Ile Val Ser Asn Ser Leu Trp
665 670

gca ttc tct gat att ctc caa gat gtt
Ala Phe Ser Asp Ile Leu Gln Asp Val
Ala Phe Ser Asp Ile Leu Gln Asp Val
680 685

ttg tcc ata acc gcg aaa gct tta gga
Leu Ser Ile Thr Ala Lys Ala Leu Gly
Leu Ser Ile Thr Ala Lys Ala Leu Gly
695 700

aga caa gga cat gag ggc ttt tca ggt
Arg Gln Gly His Glu Gly Phe Ser Gly
Arg Gln Gly His Glu Gly Phe Ser Gly
710 715

gcg cta tct atg aac tac acg gac cac
Ala Leu Ser Met Asn Tyr Thr Asp His
Ala Leu Ser Met Asn Tyr Thr Asp His
730

ggg cag ctt tat gga aaa act aac gcc
Gly Gln Leu Tyr Gly Lys Thr Asn Ala
Gly Gln Leu Tyr Gly Lys Thr Asn Ala
745 750

tca gaa caa atg tat tta ctc tcg ttc
Ser Glu Gln Met Tyr Leu Leu Ser Phe
Ser Glu Gln Met Tyr Leu Leu Ser Phe
760 765
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Pro 
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Asp 
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Ser 
755 
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Arg 
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Cvs 
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2370 
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Fig. 25 (con't) 

act caa aag agc gag gcc tta att tcc tgg aaa gca gct tat ggt tat 2466
Thr Gln Lys Ser Glu Ala Leu Ile Ser Trp Lys Ala Ala Tyr Gly Tyr
Thr Gln Lys Ser Glu Ala Leu Ile Ser Trp Lys Ala Ala Tyr Gly Tyr
775 780 785

tcc aaa aat cac cta aat acc acc tac ctc aga cct gac aaa gct cca 2514
Ser Lys Asn His Leu Asn Thr Thr Tyr Leu Arg Pro Asp Lys Ala Pro
Ser Lys Asn His Leu Asn Thr Thr Tyr Leu Arg Pro Asp Lys Ala Pro
790 795 800 805

aaa tct caa ggg caa tgg cat aac aat agt tac tat gtt ctt att tct 2562
Lys Ser Gln Gly Gln Trp His Asn Asn Ser Tyr Tyr Val Leu Ile Ser
Lys Ser Gln Gly Gln Trp His Asn Asn Ser Tyr Tyr Val Leu Ile Ser
810 815 820

gca gaa cat cct ttc cta aac tgg tgt ctt ctt aca aga cct ctg gct 2610
Ala Glu His Pro Phe Leu Asn Trp Cys Leu Leu Thr Arg Pro Leu Ala
Ala Glu His Pro Phe Leu Asn Trp Cys Leu Leu Thr Arg Pro Leu Ala
825 830 835

caa gct tgg gat ctt tca ggt ttt att tcc gca gaa ttc cta ggt ggt 2658
Gln Ala Trp Asp Leu Ser Gly Phe Ile Ser Ala Glu Phe Leu Gly Gly
Gln Ala Trp Asp Leu Ser Gly Phe Ile Ser Ala Glu Phe Leu Gly Gly
840 845 850

tgg caa agt aag ttc aca gaa act gga gat ctg caa cgt agc ttt agt 2706
Trp Gln Ser Lys Phe Thr Glu Thr Gly Asp Leu Gln Arg Ser Phe Ser
Trp Gln Ser Lys Phe Thr Glu Thr Gly Asp Leu Gln Arg Ser Phe Ser
855 860 865

aga ggt aaa ggg tac aat gtt tcc cta ccg ata gga tgt tct tct caa 2754
Arg Gly Lys Gly Tyr Asn Val Ser Leu Pro Ile Gly Cys Ser Ser Gln
Arg Gly Lys Gly Tyr Asn Val Ser Leu Pro Ile Gly Cys Ser Ser Gln
870 875 880 885

tgg ttc aca cca ttt aag aag gct cct tct aca ctg acc atc aaa ctt 2802
Trp Phe Thr Pro Phe Lys Lys Ala Pro Ser Thr Leu Thr Ile Lys Leu
Trp Phe Thr Pro Phe Lys Lys Ala Pro Ser Thr Leu Thr Ile Lys Leu
890 895 900

gcc tac aag cct gat atc tat cgt gtc aac cct cac aat att gtg act 2850
Ala Tyr Lys Pro Asp Ile Tyr Arg Val Asn Pro His Asn Ile Val Thr
Ala Tyr Lys Pro Asp Ile Tyr Arg Val Asn Pro His Asn Ile Val Thr
905 910 915

gtc gtc tca aac caa gag agc act tcg atc tca gga gca aat cta cgc 2898
Val Val Ser Asn Gln Glu Ser Thr Ser Ile Ser Gly Ala Asn Leu Arg
Val Val Ser Asn Gln Glu Ser Thr Ser Ile Ser Gly Ala Asn Leu Arg
920 925 930

cgc cac ggt ttg ttt gta caa atc cat gat gta gta gat ctc acc gag 2946
Arg His Gly Leu Phe Val Gln Ile His Asp Val Val Asp Leu Thr Glu
Arg His Gly Leu Phe Val Gln Ile His Asp Val Val Asp Leu Thr Glu
935 940 945
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Fig. 25 (con't) 

gac act cag gcc ttt cta aac tat acc ttt gac ggg aaa aat gga ttt 2994
Asp Thr Gln Ala Phe Leu Asn Tyr Thr Phe Asp Gly Lys Asn Gly Phe
Asp Thr Gln Ala Phe Leu Asn Tyr Thr Phe Asp Gly Lys Asn Gly Phe
950 955 960 965

aca aac cac cga gtg tct aca gga cta aaa tcc aca ttt taaaactcta 3043
Thr Asn His Arg Val Ser Thr Gly Leu Lys Ser Thr Phe
Thr Asn His Arg Val Ser Thr Gly Leu Lys Ser Thr Phe
970 975

agctctgctt agagttttct gtagccccgg tcgtcttaga atcctctatc catcatcgaa 3103

gaacttagca atgaaggcca agattctcac tctatgagaa ccccccc 3150
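
The codon-by-codon alignment in Figure 25 can be checked mechanically. The following is a minimal sketch, not part of the original disclosure, that translates codons with the standard genetic code and converts between residue numbers and the nucleotide numbers printed at the right of each row. The function names are hypothetical, and the default `orf_start=100` is an assumption read off the figure (the `atg` codon follows 99 untranslated nucleotides).

```python
# Sketch only: helper names are illustrative, not taken from the patent text.
BASES = "tcag"
# Standard genetic code in the conventional T, C, A, G ordering.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def translate(dna):
    """Translate from the first base, stopping at the first stop codon."""
    dna = "".join(dna.lower().split())  # tolerate the spaced codon layout
    residues = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        residues.append(aa)
    return "".join(residues)


def three_letter(protein):
    """Render a one-letter protein string in the figure's three-letter style."""
    return " ".join(THREE_LETTER[aa] for aa in protein)


def codon_end(residue, orf_start=100):
    """Nucleotide number of the last base of the codon for a given residue."""
    return orf_start + 3 * (residue - 1) + 2


if __name__ == "__main__":
    print(three_letter(translate("atg cct ctt tct ttc")))
    print(codon_end(5))
```

Under these assumptions, a row whose last residue is number n should carry the right-hand nucleotide number `codon_end(n)`, which gives a quick way to spot a misread digit in a scanned copy.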
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Figure 26 (RY-47) 

Restriction enzyme analysis of CPN100630



TspRI 
Taal | 

I 



Fokl 
I 



Hpyl78III 
Mnll | 
Sthl32I Bfalj 
BscGI | Xbal | j 

I I III 



Tsp509I 
I 



CACTGTGGATGTGATATTCGCAGAACCTCCCGTCAAATATACTCTAGATATAGGAAGCAA 

GTGACACCTACACTATAAGCGTCTTGGAGGGCAGTTTATATGAGATCTATATCCTTCGTT 

Nlalll 
Nspl 

Dral SphI Mnll 

Msel| Msel Mnll Cac8I | MboII | 

II I I II II 

ATTACGATTTTAAACCTTATTTAACGACAGGGTTGAGGCATGCCTCTTTCTTTCAAATCT 

TAATGCTAAAATTTGGAATAAATTGCTGTCCCAACTCCGTACGGAGAAAGAAAGTTTAGA 



Hinf I 
Mnll 
Bfal | 
Plel | j 
Ddel | | | 
I I I I I 



BsmAI 
Hhal 
Thai 
BseMII | 

AccI CviRI Mwol | | 

I I II I 

TCATCTTTTTGTCTACTTGCCTGTTTATGTAGTGCAAGTTGCGCGTTTGCTGAGACTAGA 

121 + + + + + + 180

AGTAGAAAAACAGATGAACGGACAAATACATCACGTTCAACGCGCAAACGACTCTGATCT 

Hpyl8 8IX 
Tthlllll 
MboII 
HphI 
Dpnl 
Bglll 
BstYI 

Hinfl Sau3AI 
Tfil Eco57I | 

Tsp509I Mnll | Earl | j 

I II I II 

CTCGGAGGGAACTTTGTTCCTCCAATTACGAATCAGGGTGAAGAGATCTTACTCACTTCA 

181 + + + + + + 240 

GAGCCTCCCTTGAAACAAGGAGGTTAATGCTTAGTCCCACTTCTCTAGAATGAGTGAAGT 

GATTTTGTTTGTTCAAACTTCTTGGGGGCGAGTTTTTCAAGTTCCTTTATCAATAGTTCC 

241 + + + + + + 300 

CTAAAACAAACAAGTTTGAAGAACCCCCGCTCAAAAAGTTCAAGGAAATAGTTATCAAGG 



Hpyl88IX 
I 
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. 26 (con't) 



CviJI 
Haelll 
EcoO109I | 
BslI Sau96I j 
I I I 



Acll 
Maell 
Msel I 



Alul 
CviJI 
Mnll | 
II 



AGCAATCTCTCCTTATTAGGGAAGGGCCTTTCCTTAACGTTTACCTCTTGTCAAGCTCCT 

301 + + + + + + 360 

TCGTTAGAGAGGAATAATCCCTTCCCGGAAAGGAATTGCAAATGGAGAACAGTTCGAGGA 



Plel 

Acil| Apol 
Fnu4Hl| Tsp509I 
Taulj Hpyl88IX MboII | 

Maelll Hhal BsmAI | j Hinfl |Hpyl78IIl|| 

I I III I I Ml 

ACAAATAGTAACTATGCGCTACTTTCTGCCGCAGAGACTCTGACCTTCAAGAATTTTTCT 

361 + + + + + + 420 

TGTTTATCATTGATACGCGATGAAAGACGGCGTCTCTGAGACTGGAAGTTCTTAAAAAGA 



Drdll 
NlaIV| 
Bsp24I | j 
Cjel| | 
CjePl| | 
III 



TaqI 
I 



CviJI 
Haelll 
Fnu4HI | 
Taul | 
Acil | j 
II I 



Cjel 
Mnll 
CjePI | 
isp24I | j 
III 



TCTATAAACTTTACAGGGAACCAATCGACAGGACTTGGCGGCCTCATCTACGGAAAAGAT 

AGATATTTGAAATGTCCCTTGGTTAGCTGTCCTGAACCGCCGGAGTAGATGCCTTTTCTA

Cjel 
Bsbl 

Dpnl Taal | 

Sau3AI | Bpml | I 

MboII | j Pflll08I | I I I 

I II I I I I I 

ATTGTTTTCCAATCTATCAAAGATTTGATCTTCACTACGAACCGTGTTGCCTATTCTCCA 

TAACAAAAGGTTAGATAGTTTCTAAACTAGAAGTGATGCTTGGCACAACGGATAAGAGGT 



Mae 1 1 
SfaNI | 
Maelll | | 
AlwNI | | j 
II II 



Maelll 
Sfcl 
Faul 



Sthl32I 
Acil | 
Mwol | j 
Cjel || | 

I II 



CviJI 
NlaIV| 
II 



GCATCTGTAACTACGTCGGCAACTCCCGCAATCACTACAGTAACTACAGGAGCCTCTGCT 

541 + + + + + + 600

CGTAGACATTGATGCAGCCGTTGAGGGCGTTAGTGATGTCATTGATGTCCTCGGAGACGA
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Fig. 26 (con't) 



Mmel Dpnl 

Sfcl TspRI Sau3AI | 

Plel| Taql| Clal | j 

Mnll I Hinfl Taal | | TaqI | | 

I II I I M M I 

CTCCAACCTACAGACTCACTCACTGTCGAAAACATATCCCAATCGATCAAGTTTTTTGGG 

601 + + + + + + 660

GAGGTTGGATGTCTGAGTGAGTGACAGCTTTTGTATAGGGTTAGCTAGTTCAAAAAACCC 

Beef I 

Tsp509I Apol 
CviRl| BsaJI Tsp509I 

NlalV CviJI | | BstDSI Msel | 

I I II I I I 

AACCTTGCCAACTTCGGCTCTGCAATTAGCAGTTCTCCCACGGCAGTCGTTAAATTCATC 

661 + + + + + + 

TTGGAACGGTTGAAGCCGAGACGTTAATCGTCAAGAGGGTGCCGTCAGCAATTTAAGTAG 

BsaXI 

Alul Fnu4HI 
CviJI Hpyl78III Taul 

Acil Nlalll | Mnll | Acil| Mnll 

I I I I I M I 

AATAACACCGCTACCATGAGCTTCTCCCATAACTTTACTTCGTCAGGAGGCGGCGTGATT 

721 + + + + + + 

TTATTGTGGCGATGGTACTCGAAGAGGGTATTGAAATGAAGCAGTCCTCCGCCGCACTAA 

Nsil 
HaeIV| 
Hin4I j 
HphI j 
CviRI | | 
MboII | j j Acil 
Hpyl78III | Ml Fokl | 
Alul Tsp509I | || SfaNI | 

CviJI SfaNI | I I I I |CjeI | I 

I II I I III I I I 

TATGGAGGAAGCTCTCTCCTTTTTGAAAACAATTCTGGATGCATCATCTTCACCGCCAAC 



781 + + + + + + 840



ATACCTCCTTCGAGAGAGGAAAAACTTTTGTTAAGACCTACGTAGTAGAAGTGGCGGTTG 
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Fig. 26 (con't) 



Maelll 
Tsp45I 
BsaHI ' 
Cjel 
Mwol 
HphI 
Msel I 



Alul | 

CviJI | 

Hgal | | 

I I I 



NlalV 
Hpyl78III | 
BslI | | 
ECONI | j | 

I I I I 



TCCTGTGTGAACAGCTTAAAAGGCGTCACCCCTTCATCAGGAACCTATGCTTTAGGAAGT 



AGGACACACTTGTCGAATTTTCCGCAGTGGGGAAGTAGTCCTTGGATACGAAATCCTTCA 



CviRI 
Sthl32I 
Bed | 
CviJI 
NlalV 



Ecil | 
Acil| j 
Fokl | | | 
III I 



BscGI 
Sf aNI I 

II 



Msel 
Tsp509I I 
NspV | j 
Taql | | 
I I ■ 



BseSI 
CviRI 
ApaLI | 
I I 



GGCGGAGCCATCTGCATCCCTACGGGAACTTTCGAATTAAAAAACAATCAGGGGAAGTGC

901 + + + + + +

CCGCCTCGGTAGACGTAGGGATGCCCTTGAAAGCTTAATTTTTTGTTAGTCCCCTTCACG 



Faul 

Sthl32l| Mwol 
BsiHKAI Rsal | j Dpnl | 

Bspl286I SfaNI j | Acil Sau3AI | | CviRI 

| I II I I I I I 

ACCTTCTCTTATAATGGTACACCAAATGATGCGGGTGCGATCTACGCCGAAACCTGCAAC 

961 + + + + + + 1020

TGGAAGAGAATATTACCATGTGGTTTACTACGCCCACGCTAGATGCGGCTTTGGACGTTG 



NlalV 
BanI 
ScrFI 
BsaJI | 
Drdll | j 
Pf 111081 ECORII | | 
BspMI | NlalV | j j 
I I I I I I 



Bfal 
I 



PstI 
TspRI 
Fnu4HI ' 
CviRI 
Tsel 
BtsI | 
Sfcl j 
Bsbl | I 
I I I 



NlalV 
Ecil | 
Acil | | 
Bbvl | j | 
III I 



ATCGTAGGGAACCAGGGTGCCTTGCTCCTAGATAGCAACACTGCAGCGAGAAATGGCGGA 

1021 + + + + + + 1080

TAGCATCCCTTGGTCCCACGGAACGAGGATCTATCGTTGTGACGTCGCTCTTTACCGCCT 
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Fig. 26 (con't) 



Bed 
CviJI |MwoI 

I I I 



Sspl 
BsiHKAI | 
Bspl286I j 
I 



Avail 
Sau96I 
Acil | 
Thai 



Hpyl78III 
Apol 
ECORI 
Tsp509I Bfal 

I 



Tthllll |HgaI | Xbal|| Acil 
I I I I III I 

GCCATCTGTGCTAAAGTGCTCAATATTCAAGGACGCGGTCCTATTGAATTCTCTAGAAAC 

1081 + + + + + + 1140 

CGGTAGACACGATTTCACGAGTTATAAGTTCCTGCGCCAGGATAACTTAAGAGATCTTTG 



Acil 
Hhal 
Thai 
Thai | 
I I 



Mmel 
Alul | 
CviJI j 
I I 



NlalV 
CviJI | 
Haelllj BslI 
ECOO109I | | Bsal | Mnll 
Sau96I | j BsmAI j SimI 
III II I 



CGCGCGGAGAAGGGTGGAGCTATTTTCATAGGCCCCTCTGTTGGAGACCCTGCGAAGCAA 



GCGCGCCTCTTCCCACCTCGATAAAAGTATCCGGGGAGACAACCTCTGGGACGCTTCGTT 



TaqI Tthlllll 
I I 



Hpyl88IX 
CviJI | 
I I 



BsaJI 
Styl 
HphI | 
I I 



Nlalll 
Nspl 
Cjel | 
I I 



ACATCGACACTTACGATTTTGGCTTCCGAAGGTGATATTGCGTTCCAAGGAAACATGCTC 

1201 + + + + + + 1260 

TGTAGCTGTGAATGCTAAAACCGAAGGCTTCCACTATAACGCAAGGTTCCTTTGTACGAG 

Hinfl Hin4I 

Tfil Cjel TspRI Hin4I | 

ScrFI | Bed | Taal | BsaXI | j 

EcoRII | j Acil BsrDl|Sfcl| j Bsgl| | | 

I I I I II II I M I I 

AATACAAAACCTGGAATCCGCAATGCCATCACTGTAGAAGCAGGGGGAGAGATTGTGTCT 

1261 + + + + + + 1320

TTATGTTTTGGACCTTAGGCGTTACGGTAGTGACATCTTCGTCCCCCTCTCTAACACAGA 



Dpnl 

CviRI Mae I I Sau3AI | CviJI 

BsmAI Mnll CviJI | Alwl | j BsaXI | 

II II III M 

CTATCTGCACAAGGAGGCTCACGTCTTGTATTTTATGATCCCATTACACATAGCCTCCCA 

1321 + + + + + + 1380 

GATAGACGTGTTCCTCCGAGTGCAGAACATAAAATACTAGGGTAATGTGTATCGGAGGGT
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Fig. 26 (con't) 



Alwl 
HaelV 
Hin4I 
Sfcl| 
AlwNI | 
Dpnl I 
BstYI 
Sau3AI 
Hpyl78III | 
Haell | I 
Hhal | | j 

Mnll Eco57I Mwol || || 

I I I II II 

ACCACAAGTCCGTCTAATAAAGACATTACAATCAACGCTAATGGCGCTTCAGGATCTGTA 

1381 + + + + + + 1440 

TGGTGTTCAGGCAGATTATTTCTGTAATGTTAGTTGCGATTACCGCGAAGTCCTAGACAT 

BseRI BsmFI 

Plel|HinfI Sfcl |MnlI Cac8I Bsbl 

II I III II 

GTCTTTACAAGTAAGGGACTCTCCTCTACAGAACTCCTGTTGCCTGCCAACACGACAACT 

1441 + + + + + + 1500 

CAGAAATGTTCATTCCCTGAGAGGAGATGTCTTGAGGACAACGGACGGTTGTGCTGTTGA 



Dpnl 
Sau3AI | 
Hpyl78III 
Taal | 
CjePI | | 
Bfal | | | 
I III 



Bfal 

I 



EC057I 
Acil | 
CjePI MboII | j 
I III 



ATACTTCTAGGAACAGTCAAGATCGCTAGTGGAGAACTGAAGATTACTGACAATGCGGTT 



TATGAAGATCCTTGTCAGTTCTAGCGATCACCTCTTGACTTCTAATGACTGTTACGCCAA 



Mwol 
CviJI | 
Cac8I | j 
I I I 



Hpyl78III 
Banll 
Bspl286I 
CviJI 
Mnll | 



Banll 
Bspl286I 
BpulOI | 
Ddel j 
Ddel CviJI | |BseMII 
I III I 



ScrFI 
BsaJI 
BsaJI | 
BseMII j 
EcoRII | 
Alul | j 
CviJI j j 
II 
I ■ 



NlalV 
I 



GTCAATGTTGCTGGCTTCGCTACTCAGGGCTCAGGTCAGCTTACCCTGGGCTCTGGAGGA 



CAGTTACAACGACCGAAGCGATGAGTCCCGAGTCCAGTCGAATGGGACCCGAGACCTCCT 
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Fig. 26 (con't) 



AccI 
BspMI 
Sfcl| 
MspAlI | 
Acil 
Fnu4HI 
Taul 
Mwol | 
Taqll | 
Aarl 
BsiHKAI | 
Bspl286I j 
BSCGI 
BsaJI 
BstDSI 
Bsbl 
Bpml 
Sthl32I 
Bsu36I Cac8I | 
Ddel CviJI | | 
I III. 

ACCTTAGGGCTGGCAACACCCACGGGAGCACCTGCCGCTGTAGACTTTACGATTGGAAAG 

1621 + + + + + + 1680

TGGAATCCCGACCGTTGTGGGTGCCCTCGTGGACGGCGACATCTGAAATGCTAACCTTTC 



Dpnl 
Sau3AI 
Taql| 
Alwl | | 
BsmI | j 
I I I 



Cac8I 
CviRI | 
SfaNI | j 
I I I 



TTAGCATTCGATCCTTTTTCCTTCCTAAAAAGAGATTTTGTTTCAGCATCAGTAAATGCA 



AATCGTAAGCTAGGAAAAAGGAAGGATTTTTCTCTAAAACAAAGTCGTAGTCATTTACGT 



Hpyl78III 
Drdll 





Alol 




Ppil 




Banll | 




BsaXI j 




BsiHKAI | 




Bspl286I j 


Maelll 


Saclj 


Tsp45I 


Alul | j 


Maell | 


Msel CviJI j j 



Dpnl 

Maelll Bglll | 
Maell| BstYI j 
Nlalll | |Sau3AI j 
III I I 

GGCACAAAAAACGTCACTTTAACAGGAGCTCTGGTTCTTGATGAACATGACGTTACAGAT 

1741 + + + + + + 1800

CCGTGTTTTTTGCAGTGAAATTGTCCTCGAGACCAAGAACTACTTGTACTGCAATGTCTA 
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Fig. 26 (con't) 



Bpml 

HaelV Tsp509I 

Hin4I BsrI | 

I I I 

CTTTATGATATGGTGTCATTACAATCTCCAGTAGCAATTCCTATCGCTGTTTTCAAAGGA 

1801 + + + + + + 1860

GAAATACTATACCACAGTAATGTTAGAGGTCATCGTTAAGGATAGCGACAAAAGTTTCCT 

CviJI 
Bgll | 

Maelll Hinfl Mwol j 

Taal Bed Hin4I | Cjel | j 

BsaXI | Ddel Hpyl78III | Plel | | CviJI | j | 

III II III I I I I 

GCAACCGTTACTAAGACAGGATTTCCTGATGGGGAGATTGCGACTCCAAGCCACTACGGC 

1861 + + + + + + 1920 

CGTTGGCAATGATTCTGTCCTAAAGGACTACCCCTCTAACGCTGAGGTTCGGTGATGCCG 



Tsp509I 
Sthl32I 
BscGI 



BsaJI 
Sty I 



BsmFI 
Avail 
Sau96I 
Bcefl | 
BsmFI | 
I I 



Cjel 
NlalV 
Avail 
Nlalll 
Sau96I 
BslI | 
I I 



Acelll 
BsmFI 
BslI | 
Bed | 
EcoNI | 
Hpyl78III 
AlwNI | 
Mnll | 
Alul | | 
CviJI | j 
I II 



TACCAAGGAAAGTGGTCCTACACATGGTCCCGTCCCCTGTTAATTCCAGCTCCTGATGGA 
ATGGTTCCTTTCACCAGGATGTGTACCAGGGCAGGGGACAATTAAGGTCGAGGACTACCT 



NlalV 
Avail 
ECOO109I 
Psp5II 
Sau96I 
ScrFI I 
EcoRII | j 
Mnll I j j 

I I ■ 



Bpml 
Hhal | 
Mnll | j 
Bfal | | | 
I III 



Hpyl8 8IX 
Apol | 
EcoRI j 
Tsp509I j 
I 



GGATTTCCTGGAGGTCCCTCTCCTAGCGCAAATACTCTCTATGCTGTATGGAATTCAGAC 



CCTAAAGGACCTCCAGGGAGAGGATCGCGTTTATGAGAGATACGACATACCTTAAGTCTG 
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Fig. 26 (con't) 



BssSI 
MslI 
I 



Sthl32I 
Maelll | 
Hpyl78III 
Aval | 
Dpnl 



BstYI 
Sau3AI 
Ddel | 
Alwl | I 
I I I 



Cjel 
BsaXI | 
Hin4I | 
Tsp509I I I 

III 



ACTCTCGTGCGTTCTACCTATATCTTAGATCCCGAGCGTTACGGAGAAATTGTCAGCAAC 



TGAGAGCACGCAAGATGGATATAGAATCTAGGGCTCGCAATGCCTCTTTAACAGTCGTTG 

Alul Hpyl88IX 
CviJI Ddel Cjel BsmI | Fokl 

I I I I I I 

AGCTTATGGATTTCCTTCTTAGGAAATCAGGCATTCTCTGATATTCTCCAAGATGTTCTT 

2101 + + + + + + 2160

TCGAATACCTAAAGGAAGAATCCTTTAGTCCGTAAGAGACTATAAGAGGTTCTACAAGAA 



Dpnl 
HaeIV| 

Hin4l| ScrFl| 
Sau3AI | j Aval | j 
Sthl32I j |BsaJI j | 
III III 



Sthl32I 
Neil 
ScrFI 
Smal 
Mspl | 
Neil j 



Alul 
CviJI 
Hindi I I | 
Thai | j 
Acil | | | 
II II 



BsaXI 
Hin4I 
CviJI | 
NlaIV| j 
Mwol | | j TaqI 
'III I 



TTGATAGATCATCCCGGGTTGTCCATAACCGCGAAAGCTTTAGGAGCCTATGTCGAACAC 



AACTATCTAGTAGGGCCCAACAGGTATTGGCGCTTTCGAAATCCTCGGATACAGCTTGTG 



Mnll CviJI 
BslI | Nlalll | 

I ' 



CviJI 
Mwol | 
Mnll Bbvl | j 

I III 



Fnu4HI 

Alul | 
CviJI j 

Tsel |HhaI 

II ' 



ACACCAAGACAAGGACATGAGGGCTTTTCAGGTCGCTATGGAGGCTACCAAGCTGCGCTA 



TGTGGTTCTGTTCCTGTACTCCCGAAAAGTCCAGCGATACCTCCGATGGTTCGACGCGAT 



Avail 
Sau96I 
Cjel | 
I I 



Maell Sthl32I 

I I 



Alul 
CviJI 
Fnu4HI | 
Tsel | j 
Cjel || | 
' II ' 



TCTATGAACTACACGGACCACACTACGTTAGGACTTTCTTTCGGGCAGCTTTATGGAAAA 

2281 + + + + + + 2340

AGATACTTGATGTGCCTGGTGTGATGCAATCCTGAAAGAAAGCCCGTCGAAATACCTTTT 
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Fig. 26 (con't) 

Hinfl 
Tfil Hpyl88IX 
PflllOSI |MaeII Ddel | BseMII 

I I I I I I 

ACTAACGCCAACCCCTACGATTCACGTTGCTCAGAACAAATGTATTTACTCTCGTTCTTT 

2341 + + + + + + 2400 

TGATTGCGGTTGGGGATGCTAAGTGCAACGAGTCTTGTTTACATAAATGAGAGCAAGAAA 



Hinfl 
Hpyl78III 
Maelll I 
Tsp45I j 
Plel | j 
I II 



Mnll 
I 



ScrFI 
EcoRII | 
Tsp509I 
Msel ' 



CviJI 
Hael 
Haelll 
StuI 

I 



Alul 
CviJI 
Fnu4HI | 
Tsel| | 

II 



GGTCAATTCCCTATCGTGACTCAAAAGAGCGAGGCCTTAATTTCCTGGAAAGCAGCTTAT 

2401 + + + + + + 2460 

CCAGTTAAGGGATAGCACTGAGTTTTCTCGCTCCGGAATTAAAGGACCTTTCGTCGAATA 



Alul 
CviJI 
BseMII | 

HphI Hpyl88IX Bce83I | j 

Bbvl | Ddel | Mnll | jj Smll 

II I I II II I 

GGTTATTCCAAAAATCACCTAAATACCACCTACCTCAGACCTGACAAAGCTCCAAAATCT 

2461 + + + + + + 2520 

CCAATAAGGTTTTTAGTGGATTTATGGTGGATGGAGTCTGGACTGTTTCGAGGTTTTAGA 

PstI 
CviRI | 

BsrDI Maelll Fokl Sfcl | j Alol 

I I I I I I I 

CAAGGGCAATGGCATAACAATAGTTACTATGTTCTTATTTCTGCAGAACATCCTTTCCTA 

2521 + + + + + + 2580 

GTTCCCGTTACCGTATTGTTATCAATGATACAAGAATAAAGACGTCTTGTAGGAAAGGAT 



Dpnl 
BstYI 
Sau3AI 
Alul 
CviJI 
Hindi I I 
Mnll 

Bbsl Smll | 

MboII BsrI Bce83I CviJI | | | | | Alwl Acil 

II I II I I I I I I 

AACTGGTGTCTTCTTACAAGACCTCTGGCTCAAGCTTGGGATCTTTCAGGTTTTATTTCC 

2581 + + + + + + 2640 

TTGACCACAGAAGAATGTTCTGGAGACCGAGTTCGAACCCTAGAAAGTCCAAAATAAAGG 
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Fig. 26 (con't) 

Maell 
CviRI 

Bfal Dpnl 
Avrll| Bglll 
BsaJlj Bsrl 
Apol | | BstYI | | | Alul 

EcoRI | | Sau3AI | I |CviJI 

Tsp509I Styl Cjel AlwNI | | j |Cjel| 

I II I 

GCAGAATTCCTAGGTGGTTGGCAAAGTAAGTTCACAGAAACTGGAGATCTGCAACGTAGC 

2641 + + + + + + 2700

CGTCTTAAGGATCCACCAACCGTTTCATTCAAGTGTCTTTGACCTCTAGACGTTGCATCG 



Bpml Drdll 
Mnlll Rsal MboII Fokl | 

II I I I I 

TTTAGTAGAGGTAAAGGGTACAATGTTTCCCTACCGATAGGATGTTCTTCTCAATGGTTC 

2701 + + + + + + 2760 

AAATCATCTCCATTTCCCATGTTACAAAGGGATGGCTATCCTACAAGAAGAGTTACCAAG 



NlalV Bed BsaBI 

Msel CviJI I TspRI | CviJI EcoRV | 

III II III 

ACACCATTTAAGAAGGCTCCTTCTACACTGACCATCAAACTTGCCTACAAGCCTGATATC 

2761 + + + + + + 2820 

TGTGGTAAATTCTTCCGAGGAAGATGTGACTGGTAGTTTGAACGGATGTTCGGACTATAG 



Maelll 
Tsp45I 
Mnll | 
Sspl | | 
II 



PshAI 
Taal | 

II 



BsmAI 
BsmBI 
I 



Ddel 
Dpnl 
Sau3AI 
Taql| 
BsiHKAI | | 
Bspl286I j | 
I II 



TATCGTGTCAACCCTCACAATATTGTGACTGTCGTCTCAAACCAAGAGAGCACTTCGATC 

2821 + + + + + + 2880

ATAGCACAGTTGGGAGTGTTATAACACTGACAGCAGAGTTTGGTTCTCTCGTGAAGCTAG 



BsaJI 
BstDSI 
Tthlllll 
Acil | 
Fnu4HI | 
BseMII | j 
Cjel |Taul| 
I I II 



Rsal 
BsrGI | 

Hpyl78III Cjel |Taul| | Taal TatI | 

I I I II I I II 

TCAGGAGCAAATCTACGCCGCCACGGTTTGTTTGTACAAATCCATGATGTAGTAGATCTC 

2881 + + + + + + 2940 

AGTCCTCGTTTAGATGCGGCGGTGCCAAACAAACATGTTTAGGTACTACATCATCTAGAG 



Mnll 
Dpnl 
Bglll 
BstYI 
Cjel Sau3AI 
Nlalll HphI | 
I I 
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Fig. 26 (con't) 



CviJI 

Hael Bsp24I 

Haelll Sthl32I BscGI CjePI 

BsaJI Ddel StuI BseMII | BslI | Cjel| 

I II I I I I M 

ACCGAGGACACTCAGGCCTTTCTAAACTATACCTTTGACGGGAAAAATGGATTTACAAAC 

2941 + + + + + + 3000

TGGCTCCTGTGAGTCCGGAAAGATTTGATATGGAAACTGCCCTTTTTACCTAAATGTTTG 

Sfcl Cjel Alul 

Accl| CjePI Dral CviJI 

Drain || Bsp24l| Msel| Ddel | Ddel 

I II II II I I I 
CACCGAGTGTCTACAGGACTAAAATCCACATTTTAAAACTCTAAGCTCTGCTTAGAGTTT 
3001 + + + + + + 3060 



GTGGCTCACAGATGTCCTGATTTTAGGTGTAAAATTTTGAGATTCGAGACGAATCTCAAA 



Hinf I 
Ddel | 
Sthl32I | j 
BsiEI | | j 

Mspl Ml | CviJI 

Neil III j Hael 

ScrFI III j Haelll 

BsaJI | IN I TaqI BsrDI | 

CviJI | | III | Mnll | Mwol| | 

Sfcl | Tfil Bed | j Ddel MboII || j 

I II I I I I I III I I II I 
TCTGTAGCCCCGGTCGTCTTAGAATCCTCTATCCATCATCGAAGAACTTAGCAATGAAGG 

3061 + + + + + + 3120 



AGACATCGGGGCCAGCAGAATCTTAGGAGATAGGTAGTAGCTTCTTGAATCGTTACTTCC 



Hin4I 
Hinfl I 
Tfil j 

I I 

CCAAGATTCTCACTCTATGAGAACCCCCCC 

3121 + + + 3150 

GGTTCTAAGAGTGAGATACTCTTGGGGGGG 
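
Each row of the restriction map in Figure 26 prints the coding strand above its complement, with enzyme names marked over their recognition sites. The sketch below, offered only as an illustration, checks that a printed strand pair is complementary and locates exact-match recognition sites in a row. The function names are hypothetical. The recognition sequences used (GAATTC for EcoRI, TCTAGA for XbaI) are the commonly cited ones and are not stated in the figure itself; enzymes with degenerate sites would need pattern matching that this sketch does not attempt.

```python
# Sketch only: function names are illustrative, not taken from the patent text.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def complement(seq):
    """Base-by-base complement, in the same left-to-right order as the figure."""
    return seq.translate(COMPLEMENT)


def reverse_complement(seq):
    """Complement read 5' to 3' on the opposite strand."""
    return complement(seq)[::-1]


def find_sites(seq, site, row_start=1):
    """Return the sequence positions where `site` starts (overlaps included).

    `row_start` is the number printed at the left of the ruler line, so the
    returned positions are in the figure's own numbering.
    """
    seq, site = seq.upper(), site.upper()
    hits = []
    i = seq.find(site)
    while i != -1:
        hits.append(row_start + i)
        i = seq.find(site, i + 1)
    return hits


if __name__ == "__main__":
    # Last row of Figure 26 (positions 3121 to 3150), both strands as printed.
    top = "CCAAGATTCTCACTCTATGAGAACCCCCCC"
    bottom = "GGTTCTAAGAGTGAGATACTCTTGGGGGGG"
    print(complement(top) == bottom)

    # Row 1081 to 1140, where the map marks EcoRI and XbaI.
    row = "GCCATCTGTGCTAAAGTGCTCAATATTCAAGGACGCGGTCCTATTGAATTCTCTAGAAAC"
    print(find_sites(row, "GAATTC", row_start=1081))
    print(find_sites(row, "TCTAGA", row_start=1081))
```

A complement check of this kind is also a convenient way to repair rows in which scanning has inserted spaces or dropped characters in one of the two strands.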
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Figure 27: CPN100397 

1 MKIPLRFLLI SLVPTLSMSN LLGAATTEEL SASNSFDGTT STTSFSSKTS 
51 SATDGTNYVF KDSWIENVP KTGETQSTSC FKNDAAAGDL NFLGGGFSFT 
101 FSNIDATTAS GAAIGSEAAN KTVTLSGFSA LSFLKSPAST VTNGLGAINV 
151 KGNLSLLDND KVLIQDNFST GDGGAINCAG SLKIANNKSL SFIGNSSSTR 
201 GGAIHTKNLT LSSGGETLFQ GNTAPTAAGK GGAIAIADSG TLSISGDSGD 
251 IIFEGNTIGA TGTVSHSAID LGTSAKITAL RAAQGHTIYF YDPITVTGST 
301 SVADALNINS PDTGDNKEYT GTIVFSGEKL TEAEAKDEKN RTSKLLQNVA 
351 FKNGTWLKG DWLSANGFS QDANSKLIMD LGTSLVANTE SIELTNLEIN 
401 IDSLRNGKKI KLSAATAQKD IRIDRPWLA ISDESFYQNG FLNEDHSYDG
451 ILELDAGKDI VISADSRSID AVQSPYGYQG KWTINWSTDD KKATVSWAKQ 
501 SFNPTAEQEA PLVPNLLWGS FIDVRSFQNF IELGTEGAPY EKRFWVAGIS 
551 NVLHRSGREN QRKFRHVSGG AWGASTRMP GGDTLSLGFA QLFARDKDYF 
601 MNTNFAKTYA GSLRLQHDAS LYSWSILLG EGGLREILLP YVSKTLPCSF 
651 YGQLSYGHTD HRMKTESLPP PPPTLSTDHT SWGGYVWAGE LGTRVAVENT 
701 SGRGFFQEYT PFVKVQAVYA RQDSFVELGA ISRDFSDSHL YNLAIPLGIK 
751 LEKRFAEQYY HWAMYSPDV CRSNPKCTTT LLSNQGSWKT KGSNLARQAG 
801 IVQASGFRSL GAAAELFGNF GFEWRGSSRS YNVDAGSKIK F 



Possible T cell epitope: 
516 LLWGSFIDV 
Possible B cell epitope: 
554 HRSGRENQRKFRHV
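
The epitope positions listed under each protein figure are 1-based residue numbers in the sequence above them. As a hedged illustration, the sketch below looks up a peptide from a single printed row using that row's own start number, which avoids relying on earlier rows being free of scanning losses. The function names are hypothetical, and the example rows are copied from Figure 27.

```python
# Sketch only: function names are illustrative, not taken from the patent text.
def parse_row(line):
    """Split a row such as '501 SFNPTAEQEA PLVPNLLWGS' into (start, residues)."""
    start, *blocks = line.split()
    return int(start), "".join(blocks)


def peptide_at(row, position, length):
    """Return `length` residues beginning at 1-based `position` within `row`."""
    start, residues = parse_row(row)
    offset = position - start
    if offset < 0 or offset + length > len(residues):
        raise ValueError("peptide does not lie entirely within this row")
    return residues[offset:offset + length]


if __name__ == "__main__":
    row_501 = "501 SFNPTAEQEA PLVPNLLWGS FIDVRSFQNF IELGTEGAPY EKRFWVAGIS"
    row_551 = "551 NVLHRSGREN QRKFRHVSGG"
    print(peptide_at(row_501, 516, 9))   # possible T cell epitope
    print(peptide_at(row_551, 554, 14))  # possible B cell epitope
```

The same lookup applies to Figures 28 to 36, provided the row that contains the epitope is intact.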
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Figure 28: CPN10 0421 

1 MPPLNADDVL PRDHLSDGSF SDTYPDITTQ AIILIFLALS PFLVMLLTSY 
51 LKIIITLVLL RNALGVQQTP PSQVLNGIAL ILSIYVMFPT GVAMYKDARK 
101 EIEANTIPQS LFTAEGAETV FVALNKSKEP LRSFLIRNTP KAQIQSFYKI 
151 SQKTFPSEIR AHLTASDFVI IIPAFIMGQI KNAFEIGVLI YLPFFVIDLV 
201 TANVLVAMQM MMLSPLSISL PLKLLLIVMV DGWTLLLQGL MISFK 



Possible T cell epitope: 



188 VLIYLPFFV 



Possible B cell epitope: 
125 NKSKEPLR 






Figure 29: CPN100422 

1 MKFFSLIFKD DDVSPNKKVL SPEAFSAFLD AKELLEKTKA DSEAYVAETE
51 QKCAQIRQEA KDQGFKEGSE SWSKQIAFLE EETKNLRIRV REALVPLAIA
101 SVRKIIGKEL ELHPETIVSI ISQALKELTQ NKHIIISVNP KDLPLVEKSR
151 PELKNIVEYA DSLILTAKPD VTPGGCIIET EAGIINAQLD VQLDALEKAF
201 STILKAKNPV DEPSETSSST DSSSLSNDQD KKE



Possible T cell epitope: 
163 LILTAKPDV 

Possible B cell epitope: 
226 SNDQDKKE
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Figure 30: CPN100424 



1 MTLLCCTSCN SRSLIVHGLP GREANEIWL LVSKGVAAQK LPQAAAATAG 
51 AATEQMWDIA VPSAQITEAL AILNQAGLPR MKGTSLLDLF AKQGLVPSEL 
101 QEKIRYQEGL SEQMASTIRK MDGWDASVQ ISFTTENEDN LPLTASVYIK 
151 HRGVLDNPNS IMVSKIKRLI ASAVPGLVPE NVSWSDRAA YSDITINGPW 

201 GLTEEIDYVS WGIILAKSS LTKFRLIFYV LILILFVISC GLLWVIWKTH
251 TLIMTMGGTK GFFNPTPYTK NALEAKKAEG AAADKEKKED ADSQGESKNA
301 ETSDKDSSDK DAPEGSNEIE GA



Possible T cell epitope: 
201 GLTEEIDYV 
Possible B cell epitope: 

284 DKEKKEDADSQGESKNAETSDKDSSDKDAPEGSNEIE
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Figure 31: CPN100426 



1 MTIRVRNLAY SVNKKKILDG VTFSLERGHI TLFVGKSGSG KTMILRALAG 
51 LVQPTQGDIW IEGEAPALVF QQPELFSHMT VLGNCTHPQI HIKGRSTEEA 
101 REKAFELLHL LDIEEVAKNY PDQLSGGQKQ RVAIVRSLCM DKHTLLFDEP 
151 TSALDPFATA SFRHLLETLR DQELTVGLTT HDMQFVHSCL DRIYLIDQGT 
201 VAGVYDKRDG ELDSGHPLSK YIHSAQ



Possible T cell epitope: 
145 LLFDEPTSA 
Possible B cell epitope: 
205 YDKRDGE
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Figure 32: CPN100508 

1 MKRPFFTYLC IIFYGSCASL SLHAGLSFPE VRGATAAWH ADSGKVFYDK
51 DIDAVIYPAS MTKIATALFI LKHYPTVLDT LIKVKQDAIA SITPQAKKQS
101 GYRSPPHWLE TDGSTIQLHL REELLGWDLF HALLVCSAND AANVLAMACC
151 GSVEKFMDKL NFFLKEEIGC THTHFNNPHG LHHPNHYTTT RDLISIMRCA
201 LKEPPFRGVI STTSYKIGAT NLHGERILSP TNKLLLPGST YHYPPALGGK
251 TGTTKTAGKN LIMAAEKNNR LLVTIATGYS GPVSDLYQDV IALCETVFNE
301 PLLRKELVPP SDCLQLEIAN LGKLSCPLPE GLYYDFYASE DREPLSVSFI
351 AHADAFPIEQ GDLLGHWVFY DDEGKKISSQ PFYAPCRFER TIKPWKLYMK
401 RVFTSYRTYM SITMLLMYFR IRKHRKYKNL KHYSKI



Possible T cell epitope: 
156 FMDKLNFFL 
Possible B cell epitope: 
422 RKHRKYKN 
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Figure 33: CPN100515 



1 MASNPILQIE DLSITLAKQR QQYPIVQSLS FTINEGQTLA IIGESGSGKS 
51 VSAHAILRLL PCPPFSVSGQ VNFQGHNLLT ASRSIQKKII GTEISMIFQN 
101 PQASLNPVFT IEQQFREIIH THLALTAEVA KEKMLYALEE TGFHDPRLCL 
151 NLYPHQLSGG MLQRICIAMA LLCSPKLLIA DEPTTALDVS VQYQILQLLK 

201 TLQKKTGMSL LIITHNMGW AETADDVLVL YAGRMVECAP AVQMFHNPSH
251 PYTRDLLASR PSLQPQQLGS FNPIPGQPPH YTAFPSGCRY HPRCSKILNR
301 CSAEAPEIYP VREGHKVRVG CMTTNFPQPL IQATSLTKHY YKRSFWFQGK
351 TIASRPVDDV SFSLYSRRAV GLIGESGSGK STLALALAGL LPLTSGFLTF
401 NGTPIKLHSK HGRHQLRSQV RLVFQNPQAS LNPRKTILDS LGHSLLYHKL
451 VPKEKVLATV REYLELVGLS EEYFYRYPHQ LSGGQQQRVS IARALLGVPQ
501 LIICDEIVSA LDLSIQAQIL NMLAELQKKL SLTYLFISHD LAWRSFCTE 
551 VFIMYKGQIV EKGNTKRIFS DPQHPYTRML LNAQLPETPD QRQSKPIFQE 
601 YHKDSEESCS TGCYFYNRCP QKQEACKSEI IPNQGDAHHT YRCIH 



Possible T cell epitope: 

59 LLPCPPFSV 

Possible B cell epitopes: 

18 KQRQQY 
587 ETPDQRQSK 
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Figure 34: CPN100538 

1 MPGIEKAATT VAVPQDKSEE EKVKERLTKR ELTCEDLKDN GYTVNFEDIS 
51 ILELLQFVSK ISGTNFVFDS NDLQFNVTIV SHDPTSVDDL STILLQVLKM 
101 HDLKWEQGN NVLIYRNPHL SKLSTWTDS SLKETCEAW VTRVFRLYRR 
151 QPSAAVNIIQ PLLSHDAIVS ASEATRHVII SDIAGNVDKV SDLLAALDCP 
201 GTSVDMTEYE VKYANPAALV SYCQDVLGTL AEDDAFQMFI QPGTNKIFW 
251 SSPRLANKAE QLLKSLDVPE MAHTLDDPAS TALALGGTGT TSPKSLRFFM 
301 YKLKYQNGEV IANALQDIGY NLYVTTAMDE DFINTLNSIQ WLEWNSIVI
351 IGNQGNVDRV IGLLNGLDLP PKQVYIEVLI LDTSLEKSWD FGVQWVALGD 
401 EQSKVAYASG LLNNTGIATP TKATVPPGTP NPGSIPLPTP GQLTGFSDML 
451 NSSSAFGLGI IGNVLSHKGK SFLTLGGLLS ALDQDGDTVI VLNPRIMAQD 
501 TQQASFFVGQ TVPYQTIKYY IQETGTVTQN IDYEDIGVNL WTSTVAPNN 
551 WTLQIEQTI SELHSASGSL TPVTDKTYAA TRLQIPDGCF LVMSGHIRDK 
601 TTKWSGVPL LNSIPLIRGL FSRTIDQRQK RNIMMFIKPK VISSFEEGTR 
651 VTNKEGYRYN WEADEGSMQV APRHAPECQG PPSLQAESDF KIIEIEAQ 



Possible T cell epitope: 

50 SILELLQFV

Possible B cell epitopes: 

15 QDKSEEEK 
626 DQRQKRN 
Figure 35: CPN100557

1 MSRKDNEVSL ARSIFNILSG TFCSRITGIF REIAMATYFG ADPIVAAFWL 

51 GFRTVFFLRK ILGGLILEQA FIPHFEFLRA QSLDRAAFFF RRFSRLIKGS 

101 TIIFTLLIEA VLWVFFNNVE EGTYDMILLT MILLPCGIFL MMYNVNGALL 

151 HCGNKFFGVG LAPWVNIIW IFFVIAARHS DPRERIIGLS VALVIGFFFE 

201 WLITVPGWK FLLEAKSPPQ EHDSVRALLA PLSLGILTSS IFQLNLLSDI

251 CLARYVHEIG PLYLMYSLKI YQLPIHLFGF GVFTVLLPAI SRCVQREDHE 

301 RGLKLMKFVL TLTMSVMIIM TAGLLLLALP GVRVLYEHGL FPQSAVYAIV 

351 RVLRGYGASI IPMALAPLVS VLFYAQRQYA VPLFIGIGTA LANIVLSLVL

401 GRWVLKDVSG ISYATSITAW VQLYFLWYYS SKRLPMYSKL LWESIRRSIK 

451 VMGTTMLACM ITLGLNILTQ TTYVIFLNPL TPLAWPLSSI TAQAIAFLSE 

501 SCIFLAFLFG FAKLLRVEDL INLASFEYWR GQRGLLQRQH VMQDTQN 



Possible T cell epitope: 

111 VLWVFFNNV 

Possible B cell epitopes: 

1 MSRKDNE 

295 QREDHERG
Figure 36: CPN100622

1 MKTSRNKQCK ITDPLSKSSF FVGALILGKT TILLNATPLS DYFDNQANQL 
51 TTLFPLIDTL TNMTPYSHRA TLFGVRDDTN QDIVLDHQNS IESWFENFSQ 
101 DGGALSCKSL AITNTKNQIL FLNSFAIKRA GAMYVDGNFD LSENHGSIIF 
151 SGNLSFPNAS NFADTCTGGA VLCSKNVTIS KNQGTAYFIN NKAKSSGGAI 
201 QAAIINIKDN TGPCLFFNNA AGGTAGGALF ANACRIENNS QPIYFLNNQS
251 GLGGAIRVHQ ECILTKNTGS VIFNNNFAME ADISANHSSG GAIYCISCSI 
301 KDNPGIAAFD NNTAARDGGA ICTQSLTIQD SGPVYFTNNQ GTWGGAIMLR 
351 QDGACTLFAD QGDIIFYNNR HFKDTFSNHV SVNCTRNVSL TVGASQGHSA 
401 TFYDPILQRY TIQNSIQKFN PNPEHLGTIL FSSTYIPDTS TSRDDFISHF 
451 RNHIGLYNGT LALEDRAEWK VYKFDQFGGT LRLGSRAVFS TTDEEQSSSS 
501 VGSVININNL AINLPSILGN RVAPKLWIRP TGSSAPYSED NNPIINLSGP 
551 LSLLDDENLD PYDTADLAQP IAEVPLLYLL DVTAKHINTD NFYPEGLNTT 
601 QHYGYQGWS PYWIETITTS DTSSEDTVNT LHRQLYGDWT PTGYKVNPEN 
651 KGDIALSAFW QSFHNLFATL RYQTQQGQIA PTASGEATRL FVHQNSNNDA 
701 KGFHMEATGY SLGTTSNTAS NHSFGVNFSQ LFSNLYESHS DNSVASHTTT 
751 VALQINNPWL QERFSTSASL AYSYSNHHIK ASGYSGKIQT EGKCYSTTLG 
801 AALSCSLSLQ WRSRPLHFTP FIQAIAVRSN QTAFQESGDK ARKFSVHKPL 
851 YNLTVPLGIQ SAWESKFRLP TYWNIELAYQ PVLYQQNPEI NVSLESSGSS
901 WLLSGTTLAR NAIAFKGRNQ IFIFPKLSVF LDYQGSVSSS TTTHYLHAGT 
951 TFKF 



Possible T cell epitope: 

119 ILFLNSFAI 

Possible B cell epitopes: 

2 KTSRNKQ 
647 NPENKG 
694 QNSNNDAK 
Figure 37: CPN100626

1 MQVFPKVTLS LDYSADISSS TLSHYLNVAS RMRFLTISDQ NRKIKEPLVS 
51 KTPPKFLFYL GNFTACMFGM TPAVYSLQTD SLEKFALERD EEFRTSFPLL 
101 DSLSTLTGFS PITTFVGNRH NSSQDIVLSN YKSIDNILLL WTSAGGAVSC 
151 NNFLLSNVED HAFFSKNLAI GTGGAIACQG ACTITKNRGP LIFFSNRGLN 
201 NASTGGETRG GAIACNGDFT ISQNQGTFYF WNSWNWGG ALSTNGHCRI 
251 QSNRAPLLFF NNTAPSGGGA LRSENTTISD NTRPIYFKNN CGNNGGAIQT 
301 SVTVAIKNNS GSVIFNNNTA LSGSINSGNG SGGAIYTTNL SIDDNPGTIL 
351 FNNNYCIRDG GAICTQFLTI KNSGHVYFTN NQGNWGGALM LLQDSTCLLF 
401 AEQGNIAFQN NEVFLTTFGR YNAIHCTPNS NLQLGANKGY TTAFFDPIEH 
451 QHPTTNPLIF NPNANHQGTI LFSSAYIPEA SDYENNFISS SKNTSELRNG 
501 VLSIEDRAGW QFYKFTQKGG ILKLGHAASI ATTANSETPS TSVGSQVIIN 
551 NLAINLPSIL AKGKAPTLWI RPLQSSAPFT EDNNPTITLS GPLTLLNEEN 
601 RDPYDSIDLS EPLQNIHLLS LSDVTARHIN TDNFHPESLN ATEHYGYQGI 
651 WSPYWVETIT TTNNASIETA NTLYRALYAN WTPLGYKVNP EYQGDLATTP 
701 LWQSFHTMFS LLRSYNRTGD SDIERPFLEI QGIADGLFVH QNSIPGAPGF 
751 RIQSTGYSLQ ASSETSLHQK ISLGFAQFFT RTKEIGSSNN VSAHNTVSSL 
801 YVELPWFQEA FATSHSLAYG YGDHHLHAYI RHIKNRAEGT CYSHTLAAAI 
851 GCSFPWQQKS YLHLSPFVQA IAIRSHQTAF EEIGDNPRKF VSQKPFYNLT 
901 LPLGIQGKWQ SKFHVPTEWT LELSYQPVLY QQNPQIGVTL LASGGSWDIL 
951 GHNYVRNALG YKVHNQTALF RSLDLFLDYQ GSVSSSTSTH HLQAGSTLKF 



Possible T cell epitope: 

56 FLFYLGNFT 

Possible B cell epitopes: 

39 DQNRKIK
597 NEENRDPYD 
Figure 38: CPN100628

1 MLLPFTFVLA NEGLQLPLET YITLSPEYQA APQVGFTHNQ NQDLAIVGNH 
51 NDFILDYKYY RSNGGALTCK NLLISENIGN VFFEKNVCPN SGGAIYAAQN 
101 CTISKNQNYA FTTNLVSDNP TATAGSLLGG ALFAINCSIT NNLGQGTFVD 
151 NLALNKGGAL YTETNLSIKD NKGPIIIKQN RALNSDSLGG GIYSGNSLNI 
201 EGNSGAIQIT SNSSGSGGGI FSTQTLTISS NKKLIEISEN SAFANNYGSN 
251 FNPGGGGLTT TFCTILNNRE GVLFNNNQSQ SNGGAIHAKS IIIKENGPVY 
301 FLNNTATRGG ALLNLSAGSG NGSFILSADN GDIIFNNNTA SKHALNPPYR 
351 NAIHSTPNMN LQIGARPGYR VLFYDPIEHE LPSSFPILFN FETGHTGTVL 
401 FSGEHVHQNF TDEMNFFSYL RNTSELRQGV LAVEDGAGLA CYKFFQRGGT
451 LLLGQGAVIT TAGTIPTPSS TPTTVGSTIT LNHIAIDLPS ILSFQAQAPK 

501 IWIYPTKTGS TYTEDSNPTI TISGTLTLRN SNNEDPYDSL DLSHSLEKVP
551 LLYIVDVAAQ KINSSQLDLS TLNSGEHYGY QGIWSTYWVE TTTITNPTSL 

601 LGANTKHKLL YANWSPLGYR PHPERRGEFI TNALWQSAYT ALAGLHSLSS
651 WDEEKGHAAS LQGIGLLVHQ KDKNGFKGFR SHMTGYSATT EATSSQSPNF 
701 SLGFAQFFSK AKEHESQNST SSHHYFSGMC IAKYSLQRVI RLSVSLAYMF 
751 TSEHTHTMYQ GLLEGNSQGS FHNHTLAGAL SCVFLPQPHG ESLQIYPFIT 
801 ALAIRGNLAA FQESGDHARE FSLHRPLTDV SLPVGIRASW KNHHRVPLW
851 LTEISYRSTL YRQDPELHSK LLISQGTWTT QATPVTYNAL GIKVKNTMQV 
901 FPKVTLSLDY SADISSSTLS HYLNVASRMR F 



Possible T cell epitope: 

1 MLLPFTFVL 

Possible B cell epitopes: 

38 HNQNQ
619 YRPHPERRG 
669 HQKDKNG 
Figure 39: CPN100630

1 MPLSFKSSSF CLLACLCSAS CAFAETRLGG NFVPPITNQG EEILLTSDFV 
51 CSNFLGASFS SSFINSSSNL SLLGKGLSLT FTSCQAPTNS NYALLSAAET 
101 LTFKNFSSIN FTGNQSTGLG GLIYGKDIVF QSIKDLIFTT NRVAYSPASV 
151 TTSATPAITT VTTGASALQP TDSLTVENIS QSIKFFGNLA NFGSAISSSP 
201 TAWKFINNT ATMSFSHNFT SSGGGVIYGG SSLLFENNSG CIIFTANSCV 
251 NSLKGVTPSS GTYALGSGGA ICIPTGTFEL KNNQGKCTFS YNGTPNDAGA 
301 IYAETCNIVG NQGALLLDSN TAARNGGAIC AKVLNIQGRG PIEFSRNRAE
351 KGGAIFIGPS VGDPAKQTST LTILASEGDI AFQGNMLNTK PGIRNAITVE 
401 AGGEIVSLSA QGGSRLVFYD PITHSLPTTS PSNKDITINA NGASGSWFT 
451 SKGLSSTELL LPANTTTILL GTVKIASGEL KITDNAWNV AGFATQGSGQ 
501 LTLGSGGTLG LATPTGAPAA VDFTIGKLAF DPFSFLKRDF VSASVNAGTK
551 NVTLTGALVL DEHDVTDLYD MVSLQSPVAI PIAVFKGATV TKTGFPDGEI 
601 ATPSHYGYQG KWSYTWSRPL LIPAPDGGFP GGPSPSANTL YAVWNSDTLV 
651 RSTYILDPER YGEIVSNSLW ISFLGNQAFS DILQDVLLID HPGLSITAKA 
701 LGAYVEHTPR QGHEGFSGRY GGYQAALSMN YTDHTTLGLS FGQLYGKTNA 
751 NPYDSRCSEQ MYLLSFFGQF PIVTQKSEAL ISWKAAYGYS KNHLNTTYLR 
801 PDKAPKSQGQ WHNNSYYVLI SAEHPFLNWC LLTRPLAQAW DLSGFISAEF
851 LGGWQSKFTE TGDLQRSFSR GKGYNVSLPI GCSSQWFTPF KKAPSTLTIK 
901 LAYKPDIYRV NPHNIVTWS NQESTSISGA NLRRHGLFVQ IHDVVDLTED
951 TQAFLNYTFD GKNGFTNHRV STGLKSTF 



Possible T cell epitope: 

936 GLFVQIHDV

Possible B cell epitopes: 

281 KNNQGK
345 SRNRAEK 
707 HTPRQGHE 